

Reading PDB: perception of molecules from 3D atomic coordinates.

Affiliation

Center for Bioinformatics (ZBH), University of Hamburg, Bundesstrasse 43, 20146 Hamburg, Germany.

Publication information

J Chem Inf Model. 2013 Jan 28;53(1):76-87. doi: 10.1021/ci300358c. Epub 2012 Dec 17.

Abstract

The analysis of small-molecule crystal structures is a common way to gather valuable information for drug development. The necessary structural data is usually provided in specific file formats containing only element identities and three-dimensional atomic coordinates as reliable chemical information. Consequently, the automated perception of molecular structures from atomic coordinates has become a standard task in cheminformatics. The molecules generated by such methods must be both chemically valid and reasonable to provide a reliable basis for subsequent calculations. This can be a difficult task, since the provided coordinates may deviate from ideal molecular geometries due to experimental uncertainties or low resolution. Additionally, the quality of the input data often differs significantly, making it difficult to distinguish between actual structural features and mere geometric distortions. We present a method for the generation of molecular structures from atomic coordinates based on the recently published NAOMI model. By making use of this consistent chemical description, our method is able to generate reliable results even with input data of low quality. Molecules from 363 Protein Data Bank (PDB) entries could be perceived with a success rate of 98%, a result that could not be achieved with previously described methods. The robustness of our approach has been assessed by processing all small molecules from the PDB and comparing them to reference structures. The complete data set can be processed in less than 3 min, showing that our approach is suitable for large-scale applications.
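To make the task concrete, the simplest form of structure perception from element identities and coordinates can be sketched as a distance test. This is not the NAOMI-based method described in the abstract: it is a generic covalent-radius heuristic, shown only to illustrate why distorted or low-resolution coordinates make the problem hard. The function name `perceive_bonds`, the `tolerance` value, the approximate radii, and the water coordinates are all assumptions chosen for illustration.

```python
import math
from itertools import combinations

# Approximate single-bond covalent radii in angstroms (illustrative values,
# not taken from the paper).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "S": 1.05}


def perceive_bonds(atoms, tolerance=0.4):
    """Return index pairs (i, j) of atoms judged to be covalently bonded.

    atoms: list of (element, (x, y, z)) tuples, coordinates in angstroms.
    Two atoms count as bonded when their distance does not exceed the sum
    of their covalent radii plus `tolerance` (a hypothetical slack value).
    """
    bonds = []
    for (i, (el_i, pos_i)), (j, (el_j, pos_j)) in combinations(enumerate(atoms), 2):
        cutoff = COVALENT_RADII[el_i] + COVALENT_RADII[el_j] + tolerance
        if math.dist(pos_i, pos_j) <= cutoff:
            bonds.append((i, j))
    return bonds


# Hypothetical water molecule with near-ideal geometry.
water = [
    ("O", (0.000, 0.000, 0.000)),
    ("H", (0.957, 0.000, 0.000)),
    ("H", (-0.240, 0.927, 0.000)),
]
print(perceive_bonds(water))  # [(0, 1), (0, 2)]
```

A fixed cutoff like this only recovers connectivity. It says nothing about bond orders, charges, or aromaticity, and it can miss or invent bonds when coordinates are distorted, which is the gap a consistent chemical model is meant to close.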

