Duplicate entries in the Protein Data Bank: how to detect and handle them.

Authors

Wlodawer Alexander, Dauter Zbigniew, Rubach Pawel, Minor Wladek, Jaskolski Mariusz, Jiang Ziqiu, Jeffcott William, Anosova Olga, Kurlin Vitaliy

Affiliations

Center for Structural Biology, Center for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA.

Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA.

Publication information

Acta Crystallogr D Struct Biol. 2025 Apr 1;81(Pt 4):170-180. doi: 10.1107/S2059798325001883. Epub 2025 Mar 8.

Abstract

A global analysis of protein crystal structures in the Protein Data Bank (PDB) using a newly developed computational approach reveals many pairs with (nearly) identical main-chain coordinates. Such cases are identified and analyzed, showing that duplication is possible because the PDB does not currently have tools or mechanisms that would detect potentially duplicate submissions. Some duplicated entries represent modeling efforts of ligand binding that masquerade as experimentally determined structures. We propose that duplicate entries should either be obsoleted by the PDB or, at a minimum, marked with a clear 'CAVEAT' record that would alert potential users to the presence of such problems. We also suggest that using a tool for verifying the uniqueness of the deposited structure, such as that presented in this work, should become part of the routine validation procedure for new depositions.
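The abstract does not describe how the authors' tool works, so the following is only an illustrative sketch of one generic way to flag (nearly) identical main-chain coordinates; it is not the method presented in the paper. It assumes the Cα coordinates of two chains have already been extracted as equal-length lists of (x, y, z) tuples in ångströms, and it compares all pairwise intra-chain distances, which do not change when a model is rotated or translated. The function names and the 0.01 Å tolerance are hypothetical choices made for this example.

```python
import math

def max_distance_deviation(coords_a, coords_b):
    """Largest absolute difference between corresponding intra-chain
    atom-atom distances of two coordinate lists.

    Pairwise distances are unchanged by rigid motion, so a value near
    zero means the two backbones agree up to rotation and translation
    (a mirror image would also pass this simple check).
    """
    if len(coords_a) != len(coords_b):
        return math.inf  # different lengths: treated as not comparable
    worst = 0.0
    n = len(coords_a)
    for i in range(n):
        for j in range(i + 1, n):
            d_a = math.dist(coords_a[i], coords_a[j])
            d_b = math.dist(coords_b[i], coords_b[j])
            worst = max(worst, abs(d_a - d_b))
    return worst

def looks_duplicate(coords_a, coords_b, tol=0.01):
    """Flag a pair as a potential duplicate.

    tol (in angstroms) is an arbitrary illustrative threshold,
    not a value taken from the paper.
    """
    return max_distance_deviation(coords_a, coords_b) <= tol
```

The all-pairs loop is quadratic in chain length, which is acceptable for checking a single pair but would need a cheaper pre-filter before being applied across an archive the size of the PDB.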

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f17e/11966240/90495a0cdf58/d-81-00170-fig1.jpg
