Raymond John W, Kibbey Christopher E
Scientific Computing Group, Pfizer Global Research and Development, Ann Arbor Laboratories, 2800 Plymouth Road, Ann Arbor, Michigan 48105, USA.
J Chem Inf Model. 2005 Sep-Oct;45(5):1195-204. doi: 10.1021/ci0502247.
Practicing medicinal chemists tend to treat a lead compound as an assemblage of its substructural parts. By iteratively confining their synthetic efforts to localized regions, they can systematically investigate how minor changes in certain portions of the molecule affect the properties of interest, in the logical expectation that the observed beneficial changes will be cumulative. One disadvantage of this approach arises when large amounts of structural data begin to accumulate, which is now often the case owing to developments such as high-throughput screening, virtual screening, and combinatorial chemistry. How, then, does one interactively mine these diverse data in a manner consistent with the desired substructural template, so that desirable structural features can be discovered and interpreted, especially when they may not occur in the most active compounds because of structural deficiencies in other portions of the molecule? In this paper, we present an algorithm to automate this process, which has historically been performed in an ad hoc and manual fashion. Using the proposed method, significantly larger numbers of compounds can be analyzed in this fashion, potentially revealing useful combinations of structural features that would otherwise have gone undetected because of the sheer scale of modern structural and biological data collections.