Predicting protein model correctness in Coot using machine learning.

Affiliation

Department of Chemistry, University of York, York YO10 5DD, United Kingdom.

Publication information

Acta Crystallogr D Struct Biol. 2020 Aug 1;76(Pt 8):713-723. doi: 10.1107/S2059798320009080. Epub 2020 Jul 27.

DOI: 10.1107/S2059798320009080
PMID: 32744253
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7397494/
Abstract

Manually identifying and correcting errors in protein models can be a slow process, but improvements in validation tools and automated model-building software can contribute to reducing this burden. This article presents a new correctness score that is produced by combining multiple sources of information using a neural network. The residues in 639 automatically built models were marked as correct or incorrect by comparing them with the coordinates deposited in the PDB. A number of features were also calculated for each residue using Coot, including map-to-model correlation, density values, B factors, clashes, Ramachandran scores, rotamer scores and resolution. Two neural networks were created using these features as inputs: one to predict the correctness of main-chain atoms and the other for side chains. The 639 structures were split into 511 that were used to train the neural networks and 128 that were used to test performance. The predicted correctness scores could correctly categorize 92.3% of the main-chain atoms and 87.6% of the side chains. A Coot ML Correctness script was written to display the scores in a graphical user interface as well as for the automatic pruning of chains, residues and side chains with low scores. The automatic pruning function was added to the CCP4i2 Buccaneer automated model-building pipeline, leading to significant improvements, especially for high-resolution structures.
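The data split and the pruning step described in the abstract can be illustrated with a minimal Python sketch. It is not the Coot ML Correctness script: the `ResidueScore` record, the function names, the 0.5 threshold and the rule "delete the whole residue when the main chain scores low, otherwise trim only the side chain" are all assumptions made for illustration, and chain-level pruning is left out. Only the counts (639 structures, 511 for training, 128 for testing) come from the abstract.

```python
import random
from dataclasses import dataclass


@dataclass
class ResidueScore:
    """Hypothetical per-residue record holding the two predicted scores."""
    chain: str
    number: int
    main_chain: float  # predicted main-chain correctness, assumed in [0, 1]
    side_chain: float  # predicted side-chain correctness, assumed in [0, 1]


def split_structures(ids, n_test, seed=0):
    """Shuffle structure IDs and hold out n_test of them for testing."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def plan_pruning(scores, threshold=0.5):
    """Return (residues_to_delete, side_chains_to_delete).

    The threshold and the decision rule are illustrative assumptions.
    """
    residues, side_chains = [], []
    for s in scores:
        key = (s.chain, s.number)
        if s.main_chain < threshold:
            residues.append(key)      # main chain doubtful: drop the residue
        elif s.side_chain < threshold:
            side_chains.append(key)   # main chain fine: trim the side chain
    return residues, side_chains


# 639 structures split into 511 for training and 128 for testing.
train, test = split_structures(range(639), n_test=128)

# Made-up scores for three residues of chain A.
example = [
    ResidueScore("A", 10, 0.95, 0.90),
    ResidueScore("A", 11, 0.20, 0.80),
    ResidueScore("A", 12, 0.85, 0.30),
]
to_delete, to_trim = plan_pruning(example)
```

With these made-up scores, residue A 11 would be deleted outright and residue A 12 would lose only its side chain. Splitting by structure rather than by residue keeps all residues of one model on the same side of the split.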


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0bb6/7397494/e6b7ccda6d62/d-76-00713-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0bb6/7397494/12ab3d5576c5/d-76-00713-fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0bb6/7397494/9ee3a3dbf4b5/d-76-00713-fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0bb6/7397494/9d0083e087d7/d-76-00713-fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0bb6/7397494/ddd10757d6f5/d-76-00713-fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0bb6/7397494/fc6858e64580/d-76-00713-fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0bb6/7397494/36e7536f7571/d-76-00713-fig7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0bb6/7397494/a8f20ec0eedd/d-76-00713-fig8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0bb6/7397494/b7b62c374db6/d-76-00713-fig9.jpg
