Machine Learning Approaches for Quality Assessment of Protein Structures.

Affiliation

Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Macau, China.

Publication information

Biomolecules. 2020 Apr 17;10(4):626. doi: 10.3390/biom10040626.

DOI:10.3390/biom10040626

PMID:32316682

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7226485/

Abstract

Protein structures play an important role in biomedical research, especially in drug discovery and design, which require accurate protein structures in advance. However, experimental determination of protein structure is prohibitively costly and time-consuming, and computational prediction of protein structure has not been perfected. Methods that assess the quality of protein models can help in selecting the most accurate candidates for further work. Driven by this demand, many structural bioinformatics laboratories have developed methods for estimating model accuracy (EMA). In recent years, machine learning (ML)-based EMA methods have consistently ranked among the top performers in the community-wide CASP challenge. Accordingly, we systematically review all the major ML-based EMA methods developed within the past ten years. The methods are grouped by the ML approach they employ (support vector machines, artificial neural networks, ensemble learning, or Bayesian learning), and their significance is discussed from a methodological viewpoint. To orient the reader, we also briefly describe the background of EMA, including the CASP challenge and its evaluation metrics, and introduce the major ML and deep learning (DL) techniques. Overall, this review provides an introductory guide to modern research on protein quality assessment and points to directions for future research in this area.
