Automated evaluation of consistency within the PubChem Compound database.

Affiliations

Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02215, USA.

National Magnetic Resonance Facility at Madison and BioMagResBank, Department of Biochemistry, University of Wisconsin Madison, Madison, Wisconsin 53706, USA.

Publication information

Sci Data. 2019 Feb 19;6:190023. doi: 10.1038/sdata.2019.23.

Abstract

Identifying discrepant data in aggregated databases is a key step in data curation and remediation. We applied the ALATIS approach, which is based on the International Chemical Identifier (InChI) model, to the full PubChem Compound database to generate unique and reproducible compound and atom identifiers for every entry with an available three-dimensional structure. This exercise also identified entries whose structures disagree with their chemical formulas or InChI strings. Unique compound identifiers and a consistent atom nomenclature should support more rigorous links between small-molecule databases, including those that hold atom-specific information of the kind obtained from crystallography and spectroscopy. The comprehensive results of this analysis are publicly available through our webserver [http://alatis.nmrfam.wisc.edu/].
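One kind of discrepancy mentioned in the abstract, a structure that disagrees with its chemical formula or InChI string, can be illustrated with a minimal Python sketch. This is not the ALATIS implementation; it is only an illustrative check, and the function names (`inchi_formula`, `parse_formula`, `is_consistent`) are hypothetical. It assumes a single-component InChI string with no leading multiplier in the formula layer and no protonation (`/p`) layer, and it compares the element counts in that formula layer with the element symbols read from a structure record.

```python
import re
from collections import Counter


def inchi_formula(inchi: str) -> str:
    """Return the formula layer (second '/'-separated field) of an InChI string."""
    parts = inchi.split("/")
    if len(parts) < 2 or not parts[0].startswith("InChI="):
        raise ValueError(f"not an InChI string: {inchi!r}")
    return parts[1]


def parse_formula(formula: str) -> Counter:
    """Count atoms per element in a simple formula such as 'C2H6O'.

    Assumption: no leading component multipliers (e.g. '2ClH').
    """
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def is_consistent(elements, inchi: str) -> bool:
    """True if the structure's element symbols match the InChI formula layer."""
    return Counter(elements) == parse_formula(inchi_formula(inchi))


# Ethanol: the structure record lists 2 C, 1 O and 6 H atoms.
ethanol_inchi = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
structure_atoms = ["C", "C", "O"] + ["H"] * 6
print(is_consistent(structure_atoms, ethanol_inchi))

# A record missing one hydrogen would be flagged as discrepant.
print(is_consistent(structure_atoms[:-1], ethanol_inchi))
```

A check like this only catches formula-level mismatches; differences in connectivity or stereochemistry would require comparing the later InChI layers as well.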

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b0a/6380220/ed1e20a82b5c/sdata201923-f1.jpg
