Oncode Institute and Department of Biochemistry, The Netherlands Cancer Institute, Amsterdam, the Netherlands.
Nat Methods. 2023 Feb;20(2):205-213. doi: 10.1038/s41592-022-01685-y. Epub 2022 Nov 24.
Artificial intelligence-based protein structure prediction approaches have had a transformative effect on biomolecular sciences. The predicted protein models in the AlphaFold protein structure database, however, all lack coordinates for small molecules that are essential for molecular structure or function: hemoglobin lacks bound heme, zinc-finger motifs lack the zinc ions essential for structural integrity, and metalloproteases lack the metal ions needed for catalysis. Ligands important for biological function are absent too: no ADP or ATP is bound to any of the ATPases or kinases. Here we present AlphaFill, an algorithm that uses sequence and structure similarity to 'transplant' such 'missing' small molecules and ions from experimentally determined structures to predicted protein models. The algorithm was successfully validated against experimental structures. A total of 12,029,789 transplants were performed on 995,411 AlphaFold models and are available, together with associated validation metrics, in the alphafill.eu databank, a resource to help scientists make new hypotheses and design targeted experiments.
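The transplant idea described in the abstract can be illustrated with a minimal Python sketch. It is not the AlphaFill implementation: it only assumes that a donor structure is accepted when its aligned sequence is similar enough to the model, and that a rigid-body superposition (rotation plus translation) has already been computed elsewhere. The names `Atom`, `percent_identity`, `apply_rigid_transform`, `transplant` and the `min_identity` default are all hypothetical, and the cutoff is a placeholder, not a value taken from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Atom:
    """A ligand atom with a name and Cartesian coordinates (hypothetical type)."""
    name: str
    xyz: Vec3


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def apply_rigid_transform(
    atoms: Sequence[Atom],
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
) -> List[Atom]:
    """Map donor-frame atoms into the model frame: x' = R x + t."""
    moved = []
    for atom in atoms:
        x, y, z = atom.xyz
        new_xyz = tuple(
            rotation[i][0] * x + rotation[i][1] * y + rotation[i][2] * z + translation[i]
            for i in range(3)
        )
        moved.append(Atom(atom.name, new_xyz))
    return moved


def transplant(
    ligand_atoms: Sequence[Atom],
    aligned_model: str,
    aligned_donor: str,
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
    min_identity: float = 25.0,  # placeholder cutoff, an assumption of this sketch
) -> Optional[List[Atom]]:
    """Return the ligand placed in the model frame, or None if the donor is too dissimilar."""
    if percent_identity(aligned_model, aligned_donor) < min_identity:
        return None
    return apply_rigid_transform(ligand_atoms, rotation, translation)
```

In this sketch the rotation and translation would come from superposing the donor structure onto the predicted model; computing that superposition, and validating the placed ligand, are outside the scope of the example.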