Using deep-learning predictions reveals a large number of register errors in PDB depositions.

Affiliations

Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom.

European Molecular Biology Laboratory, Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany.

Publication information

IUCrJ. 2024 Nov 1;11(Pt 6):938-950. doi: 10.1107/S2052252524009114.

DOI:10.1107/S2052252524009114
PMID:39387575
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11533997/
Abstract

The accuracy of the information in the Protein Data Bank (PDB) is of great importance for the myriad downstream applications that make use of protein structural information. Despite best efforts, the occasional introduction of errors is inevitable, especially where the experimental data are of limited resolution. A novel protein structure validation approach based on spotting inconsistencies between the residue contacts and distances observed in a structural model and those computationally predicted by methods such as AlphaFold2 has previously been established. It is particularly well suited to the detection of register errors. Importantly, this new approach is orthogonal to traditional methods based on stereochemistry or map-model agreement, and is resolution independent. Here, thousands of likely register errors are identified by scanning 3-5 Å resolution structures in the PDB. Unlike most methods, the application of this approach yields suggested corrections to the register of affected regions, which it is shown, even by limited implementation, lead to improved refinement statistics in the vast majority of cases. A few limitations and confounding factors such as fold-switching proteins are characterized, but this approach is expected to have broad application in spotting potential issues in current accessions and, through its implementation and distribution in CCP4, helping to ensure the accuracy of future depositions.
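
The core idea in the abstract, comparing the residue contacts observed in a model with those predicted by a method such as AlphaFold2 and asking whether a shifted register would agree better, can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`distance_matrix`, `best_register_shift`), the 8 Å contact cutoff, the Jaccard agreement score and the synthetic random coordinates are all assumptions chosen for illustration, and the "predicted" distances here are simply taken from the error-free synthetic structure.

```python
import numpy as np


def distance_matrix(coords):
    """Pairwise Euclidean distances between per-residue coordinates (N x 3)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def best_register_shift(observed, predicted, start, stop, max_shift=4, cutoff=8.0):
    """Score candidate register shifts for the window [start, stop).

    For each shift s, the contacts that window residues make in the model
    are compared with the contacts predicted for residues i + s.
    Returns the best-scoring shift and a dict of Jaccard scores per shift.
    """
    n = observed.shape[0]
    window = np.arange(start, stop)
    # Ignore residues flanking the window so shifted indices never overlap
    # the set of partner residues being compared.
    excluded = np.arange(start - max_shift, stop + max_shift)
    outside = np.setdiff1d(np.arange(n), excluded)
    obs_contacts = observed[np.ix_(window, outside)] < cutoff

    scores = {}
    for shift in range(-max_shift, max_shift + 1):
        shifted = window + shift
        if shifted.min() < 0 or shifted.max() >= n:
            continue  # shift would run off the end of the sequence
        pred_contacts = predicted[np.ix_(shifted, outside)] < cutoff
        union = np.logical_or(obs_contacts, pred_contacts).sum()
        inter = np.logical_and(obs_contacts, pred_contacts).sum()
        scores[shift] = inter / union if union else 0.0
    best = max(scores, key=scores.get)
    return best, scores


# Synthetic demonstration (assumed data, not from the paper).
rng = np.random.default_rng(0)
true_coords = rng.uniform(0.0, 20.0, size=(60, 3))  # compact toy "structure"
predicted = distance_matrix(true_coords)            # stand-in for a prediction

# Introduce a register error: residues 20..29 are built where 22..31 belong.
model_coords = true_coords.copy()
model_coords[20:30] = true_coords[22:32]
observed = distance_matrix(model_coords)

best, scores = best_register_shift(observed, predicted, 20, 30)
print("suggested shift:", best)
```

A non-zero best shift flags the window as a possible register error and doubles as a suggested correction, which mirrors the behaviour the abstract describes. A real workflow would need predicted distances from an actual model, a proper sequence-to-structure mapping, and a more careful scoring scheme.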

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44ee/11533997/99668bbd0fda/m-11-00938-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44ee/11533997/e4a63d78309f/m-11-00938-fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44ee/11533997/f98cfac25d21/m-11-00938-fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44ee/11533997/006b4dabd57f/m-11-00938-fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44ee/11533997/e25e801958f8/m-11-00938-fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44ee/11533997/a633b480e4d0/m-11-00938-fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44ee/11533997/64125051b5c5/m-11-00938-fig7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44ee/11533997/6a7e801ad8b7/m-11-00938-fig8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44ee/11533997/506703be4bdd/m-11-00938-fig9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44ee/11533997/57abae7cf63b/m-11-00938-fig10.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/44ee/11533997/c90f88420927/m-11-00938-fig11.jpg
