Toward a Comprehensive Treatment of Tautomerism in Chemoinformatics Including in InChI V2.

Affiliations

Computer-Aided Drug Design Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, NIH, Frederick, Maryland 21702, United States.

Xemistry GmbH, Hainholzweg 11, D-61479 Glashütten, Germany.

Publication information

J Chem Inf Model. 2020 Mar 23;60(3):1253-1275. doi: 10.1021/acs.jcim.9b01080. Epub 2020 Mar 10.

Abstract

We have collected 86 different transforms of tautomeric interconversions. Out of those, 54 are for prototropic (non-ring-chain) tautomerism, 21 for ring-chain tautomerism, and 11 for valence tautomerism. The majority of these rules have been extracted from experimental literature. Twenty rules, covering the most well-known types of tautomerism such as keto-enol tautomerism, were taken from the default handling of tautomerism by the chemoinformatics toolkit CACTVS. The rules were analyzed against nine different databases totaling over 400 million (non-unique) structures as to their occurrence rates, mutual overlap in coverage, and recapitulation of the rules' enumerated tautomer sets by InChI V.1.05, both in InChI's Standard and a Nonstandard version with the increased tautomer-handling options 15T and KET turned on. These results and the background of this study are discussed in the context of the IUPAC InChI Project tasked with the redesign of handling of tautomerism for an InChI version 2. Applying the rules presented in this paper would approximately triple the number of compounds in typical small-molecule databases that would be affected by tautomeric interconversion by InChI V2. A web tool has been created to test these rules at https://cactus.nci.nih.gov/tautomerizer.
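
The rule tally and the two per-rule statistics named in the abstract (occurrence rate and mutual overlap in coverage) can be illustrated with a minimal sketch. The rule names, the toy hit sets, and the choice of the Jaccard index as the overlap measure are all assumptions made for illustration. The abstract does not say how overlap was actually computed.

```python
from itertools import combinations

# Rule counts by tautomerism class, as stated in the abstract.
RULE_COUNTS = {"prototropic": 54, "ring-chain": 21, "valence": 11}


def occurrence_rate(hits, total):
    """Fraction of database structures matched by one rule."""
    return len(hits) / total


def pairwise_overlap(rule_hits):
    """Jaccard overlap between the hit sets of every pair of rules.

    Jaccard is an assumed stand-in for "mutual overlap in coverage".
    """
    overlaps = {}
    for a, b in combinations(sorted(rule_hits), 2):
        union = rule_hits[a] | rule_hits[b]
        shared = rule_hits[a] & rule_hits[b]
        overlaps[(a, b)] = len(shared) / len(union) if union else 0.0
    return overlaps


# Hypothetical toy data: structure IDs matched by each (made-up) rule name.
toy_hits = {
    "keto_enol": {1, 2, 3, 4},
    "imine_enamine": {3, 4, 5},
    "ring_chain_example": {9},
}
total_structures = 10

total_rules = sum(RULE_COUNTS.values())  # 54 + 21 + 11
rates = {name: occurrence_rate(hits, total_structures)
         for name, hits in toy_hits.items()}
overlaps = pairwise_overlap(toy_hits)
```

With real data, each hit set would hold the identifiers of the database structures to which a given transform applies, and the same two functions would run unchanged.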
