Castillo Sandra, Ollila Osmo Henri Samuli
VTT Technical Research Centre of Finland Ltd, Espoo, Otaniemi, 02044 VTT, Finland.
Bioinformatics. 2025 Aug 2;41(8). doi: 10.1093/bioinformatics/btaf424.
Finding proteins with specific functions by mining modern databases can potentially lead to substantial advances in a wide range of fields, from medicine and biotechnology to materials science. Currently available algorithms enable mining of proteins based on their sequence or structure. However, the activities of many proteins, such as enzymes and drug targets, are dictated by active site residues and their surroundings rather than by the overall structure or sequence of the protein.
We introduce ActSeek, a fast, computer vision-inspired program that searches structural databases for proteins whose active sites are similar to that of a seed protein. ActSeek is implemented to mine proteins with desired active site environments from the AlphaFold database. We demonstrate its potential to address pressing challenges by identifying enzymes that may be used to produce biodegradable plastics or to degrade plastics, as well as potential off-targets for common drug molecules.
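To illustrate the general idea of comparing active sites by local geometry rather than by overall sequence or fold, here is a minimal Python sketch. It is not ActSeek's algorithm, which the abstract does not detail. It assumes each active site is given as a list of 3D coordinates (for example, one point per active site residue) and scores candidates by the sorted pairwise distances between those points, which do not change under rotation or translation. The function names, the example coordinates, and the 1.0 Å tolerance are all hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def distance_fingerprint(coords):
    """Sorted pairwise distances; unchanged by rotation and translation."""
    return sorted(dist(a, b) for a, b in combinations(coords, 2))


def site_distance_rmsd(seed_site, candidate_site):
    """RMS difference between the sorted pairwise distances of two sites."""
    if len(seed_site) != len(candidate_site):
        raise ValueError("sites must contain the same number of points")
    seed_fp = distance_fingerprint(seed_site)
    cand_fp = distance_fingerprint(candidate_site)
    if not seed_fp:
        raise ValueError("each site needs at least two points")
    squared = sum((s - c) ** 2 for s, c in zip(seed_fp, cand_fp))
    return sqrt(squared / len(seed_fp))


def rank_candidates(seed_site, candidates, tolerance=1.0):
    """Return (name, score) pairs within the tolerance, best match first.

    `tolerance` is a hypothetical cutoff chosen for illustration only.
    """
    scored = (
        (name, site_distance_rmsd(seed_site, site))
        for name, site in candidates.items()
    )
    return sorted(
        (hit for hit in scored if hit[1] <= tolerance),
        key=lambda hit: hit[1],
    )


# Hypothetical three-point seed site and two candidate sites.
seed = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
candidates = {
    # Same geometry as the seed, rotated and shifted.
    "rotated_copy": [(1.0, 1.0, 1.0), (1.0, 4.0, 1.0), (-3.0, 1.0, 1.0)],
    # Same shape at twice the scale, so the distances no longer match.
    "scaled_site": [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (0.0, 8.0, 0.0)],
}
hits = rank_candidates(seed, candidates)
```

A fingerprint of this kind is only a coarse filter: it ignores residue identity, cannot distinguish mirror images, and says nothing about the surrounding environment that the abstract highlights, so a real search would need additional checks.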
ActSeek source code is available at https://github.com/vttresearch/ActSeek under a Non-Commercial License Agreement.