How to crack a SMILES: automatic crosschecked chemical structure resolution across multiple services using MoleculeResolver.

Author information

Müller Simon

Affiliation

Institute of Thermal Separation Processes, Hamburg University of Technology, Eißendorfer Straße 38, 21073, Hamburg, Germany.

Publication information

J Cheminform. 2025 Aug 4;17(1):117. doi: 10.1186/s13321-025-01064-7.

Abstract

Accurate chemical structure resolution from textual identifiers such as names and CAS RN® is critical for computational modeling in chemistry and related fields. This paper introduces MoleculeResolver, an automated, robust Python-based tool designed to address inconsistencies and inaccuracies commonly encountered when converting chemical identifiers to canonical SMILES strings. MoleculeResolver systematically crosschecks structures retrieved from multiple reputable chemical databases, implements rigorous identifier plausibility checks, standardizes molecular structures, and intelligently selects the most accurate representation based on a unique resolution algorithm. SCIENTIFIC CONTRIBUTION: Benchmarks across diverse datasets confirm that MoleculeResolver significantly enhances precision, recall, and overall reliability compared to traditional single-source methods, proving its utility as a valuable resource for chemists, data scientists, and researchers engaged in high-quality molecular data analysis and predictive model development.
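The abstract does not spell out the plausibility checks or the selection algorithm, so the following is only a minimal sketch of the two ideas it names, not MoleculeResolver's actual API or method. It assumes that an identifier plausibility check can include the standard CAS RN check-digit rule, and that crosschecking can be approximated by a simple agreement vote over SMILES strings that have already been canonicalized. The function names `is_plausible_cas_rn` and `select_by_agreement`, the `min_votes` parameter, and the service labels are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Optional

# CAS RN layout: 2-7 digits, hyphen, 2 digits, hyphen, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_plausible_cas_rn(cas_rn: str) -> bool:
    """Return True if the string has CAS RN format and a valid check digit."""
    match = CAS_PATTERN.match(cas_rn.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... from right to left, then take mod 10.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(digits), start=1)
    )
    return total % 10 == check_digit


def select_by_agreement(
    candidates: Dict[str, str], min_votes: int = 2
) -> Optional[str]:
    """Pick the SMILES returned by at least `min_votes` services.

    `candidates` maps a (hypothetical) service label to a SMILES string
    that is assumed to be canonicalized already, so that equal structures
    compare equal as plain strings.
    """
    if not candidates:
        return None
    smiles, count = Counter(candidates.values()).most_common(1)[0]
    return smiles if count >= min_votes else None


if __name__ == "__main__":
    print(is_plausible_cas_rn("64-17-5"))
    print(select_by_agreement(
        {"service_a": "CCO", "service_b": "CCO", "service_c": "CC(O)C"}
    ))
```

A plain string vote only works after every candidate has been standardized and canonicalized with the same toolkit; without that step, two services returning the same molecule in different SMILES notations would be counted as disagreeing.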

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94bf/12323220/482532e61711/13321_2025_1064_Figa_HTML.jpg
