Quality of protein crystal structures.

Author information

Brown Eric N, Ramaswamy S

Affiliation

University of Iowa, Department of Biochemistry, Iowa City, IA, USA.

Publication information

Acta Crystallogr D Biol Crystallogr. 2007 Sep;63(Pt 9):941-50. doi: 10.1107/S0907444907033847. Epub 2007 Aug 17.

Abstract

The genomics era has seen the propagation of numerous databases containing easily accessible data that are routinely used by investigators to interpret results and generate new ideas. Most investigators consider data extracted from scientific databases to be error-free. However, data generated by all experimental techniques contain errors and some, including the coordinates in the Protein Data Bank (PDB), also integrate the subjective interpretations of experimentalists. This paper explores the determinants of protein structure quality metrics used routinely by protein crystallographers. These metrics, which include the R factor, R(free), the real-space correlation coefficient and Ramachandran violations, are available for most structures in the database. All structures in the PDB were analyzed for their overall quality based on nine different quality metrics. Multivariate statistical analysis revealed that while technological improvements have increased the number of structures determined, the overall quality of structures has remained constant. The quality of structures deposited by structural genomics initiatives is generally better than the quality of structures from individual investigator laboratories. The most striking result is the association between structure quality and the journal in which the structure was first published. The worst offenders are the apparently high-impact general science journals. The rush to publish high-impact work in the competitive atmosphere may have led to the proliferation of poor-quality structures.
