Computer-Aided Drug Design Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, NIH, Frederick, Maryland 21702, United States.
J Chem Inf Model. 2020 Mar 23;60(3):1090-1100. doi: 10.1021/acs.jcim.9b01156. Epub 2020 Mar 10.
We report a database of tautomeric structures that contains 2819 tautomeric tuples extracted from 171 publications. Each tautomeric entry is annotated with the experimental conditions reported in the respective publication, bibliographic details, structural identifiers (e.g., the NCI/CADD identifiers FICTS, FICuS, and uuuuu, and the Standard InChI), and chemical information (e.g., SMILES, molecular weight). Most of the tautomeric tuples are pairs; the remaining 10% are triples, quadruples, or quintuples, for a total of 5977 structures. The dominant type of tautomerism is prototropic (79%), followed by ring-chain (13%) and valence tautomerism (8%). The experimental conditions reported in the publications cover about 50 pure solvents and 9 solvent mixtures, studied with 26 unique spectroscopic or nonspectroscopic methods. ¹H and ¹³C NMR were the most frequently used methods. A total of 77 different tautomeric transform rules (SMIRKS) are each covered by at least one example tuple in the database. The database is freely available as a spreadsheet at https://cactus.nci.nih.gov/download/tautomer/.
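As a rough consistency check on the figures quoted in the abstract, the following Python sketch estimates how the 2819 tuples and 5977 structures split up. It is illustrative arithmetic only, not part of the published work: the function name `approximate_breakdown` is hypothetical, the percentages are rounded in the abstract (so all derived counts are approximate), and it assumes that the tautomerism-type percentages refer to tuples rather than to individual structures.

```python
# Figures quoted in the abstract.
TOTAL_TUPLES = 2819
TOTAL_STRUCTURES = 5977
NON_PAIR_FRACTION = 0.10  # rounded in the abstract, so results are approximate
# Assumption: these shares are fractions of tuples, not of structures.
TYPE_SHARES = {"prototropic": 0.79, "ring-chain": 0.13, "valence": 0.08}


def approximate_breakdown():
    """Derive approximate counts from the rounded figures above."""
    non_pairs = round(TOTAL_TUPLES * NON_PAIR_FRACTION)
    pairs = TOTAL_TUPLES - non_pairs
    # Every pair contributes exactly two structures; the rest belong to
    # triples, quadruples, or quintuples.
    structures_in_non_pairs = TOTAL_STRUCTURES - 2 * pairs
    by_type = {
        name: round(TOTAL_TUPLES * share) for name, share in TYPE_SHARES.items()
    }
    return {
        "pairs": pairs,
        "non_pairs": non_pairs,
        "structures_in_non_pairs": structures_in_non_pairs,
        "mean_size_of_non_pairs": structures_in_non_pairs / non_pairs,
        "tuples_by_type": by_type,
    }


if __name__ == "__main__":
    for key, value in approximate_breakdown().items():
        print(f"{key}: {value}")
```

Under these assumptions, the larger tuples average a little over three structures each, which is consistent with their being triples, quadruples, or quintuples.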