Syngenta, Jealott's Hill International Research Centre, Bracknell, Berkshire RG42 6EY, U.K.
Syngenta Crop Protection, Schaffhauserstrasse, Stein CH-4332, Switzerland.
J Chem Inf Model. 2020 Aug 24;60(8):3781-3791. doi: 10.1021/acs.jcim.0c00232. Epub 2020 Jul 23.
Databases of small, potentially bioactive molecules are ubiquitous across industry and academia. Although these databases are designed so that each unique compound appears only once, many compounds can be represented in multiple ways, so the databases require methods for standardizing the representation of chemistry. This is commonly achieved through "Chemistry Business Rules", sets of predefined rules that describe the "house style" of the database in question. At Syngenta, the historical approach to the design of chemistry business rules has been to focus on consistency of representation, with chemical relevance given secondary consideration. In this work, we overturn that convention. Using quantum chemistry calculations, we define a set of chemistry business rules for tautomer standardization that reproduces gas-phase energetic preferences. We go on to show that, compared to our historical approach, this method yields tautomers that are in better agreement with those observed experimentally in condensed phases and that are better suited for use in predictive models.
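The idea of an energy-based standardization rule can be illustrated with a minimal sketch. It is not the authors' implementation: the `Tautomer` class, the function names, and the relative energies are hypothetical placeholders chosen for illustration, and the SMILES strings are treated as opaque labels rather than parsed. The sketch assumes that tautomers of the same compound have already been grouped and assigned relative energies, then maps every member of a group to the lowest-energy form so that each compound is registered only once.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Tautomer:
    """One representation of a compound (hypothetical helper type)."""
    smiles: str
    # Relative energy in kJ/mol. Values used below are illustrative
    # placeholders, NOT results taken from the paper.
    relative_energy_kj_mol: float


def pick_standard_tautomer(candidates: Iterable[Tautomer]) -> Tautomer:
    """Return the lowest-energy tautomer; ties are broken by SMILES text
    so the choice is deterministic."""
    pool = list(candidates)
    if not pool:
        raise ValueError("at least one tautomer is required")
    return min(pool, key=lambda t: (t.relative_energy_kj_mol, t.smiles))


def build_standardization_map(groups: Iterable[List[Tautomer]]) -> Dict[str, str]:
    """Map every tautomer's SMILES to the SMILES of its group's standard form."""
    mapping: Dict[str, str] = {}
    for group in groups:
        standard = pick_standard_tautomer(group)
        for tautomer in group:
            mapping[tautomer.smiles] = standard.smiles
    return mapping


if __name__ == "__main__":
    # Two representations of one compound with placeholder energies.
    pyridone_group = [
        Tautomer("O=c1cccc[nH]1", 3.0),  # 2-pyridone form (placeholder energy)
        Tautomer("Oc1ccccn1", 0.0),      # 2-hydroxypyridine form (placeholder energy)
    ]
    for source, target in build_standardization_map([pyridone_group]).items():
        print(f"{source} -> {target}")
```

Both input representations map to the same output string, which is the property a registration database needs. In practice such preferences would be encoded as structural transformation rules rather than looked up per compound.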