Takaoka Yuji, Endo Yutaka, Yamanobe Susumu, Kakinuma Hiroyuki, Okubo Taketoshi, Shimazaki Youichi, Ota Tomomi, Sumiya Shigeyuki, Yoshikawa Kensei
Molecular Simulation Group, Research Center, Taisho Pharmaceutical Co., Ltd., 1-403 Yoshino-cho, Kita-ku, Saitama-shi, 331-9530 Saitama, Japan.
J Chem Inf Comput Sci. 2003 Jul-Aug;43(4):1269-75. doi: 10.1021/ci034043l.
The concept of drug-likeness, an important characteristic for any compound in a screening library, is nevertheless difficult to pin down. Based on our belief that this concept is implicit within the collective experience of working chemists, we devised a data set to capture an intuitive human understanding of both this characteristic and ease of synthesis, a second key characteristic. Five chemists assigned a pair of scores to each of 3980 diverse compounds, with the two scores of each pair corresponding to drug-likeness and ease of synthesis, respectively. Using this data set, we devised binary classifiers with an artificial neural network and a support vector machine. These models were found to efficiently eliminate compounds that are not drug-like and/or are hard-to-synthesize derivatives, demonstrating the suitability of these models for use as compound acquisition filters.
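As a rough illustration of the workflow the abstract describes, the sketch below turns five chemists' score pairs into binary labels and then applies a two-criterion acquisition filter. It is not the authors' procedure: the abstract does not state the score scale, how the five scores were combined, or the cutoff used, so the 1-to-5 scale, the mean-then-threshold rule, the threshold of 3.0, and all names (`binary_labels`, `acquisition_filter`, `cmpd_A`, and so on) are assumptions made for this example. The lookups passed to the filter stand in for the trained neural network or support vector machine.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

# One chemist's assessment: (drug-likeness score, ease-of-synthesis score).
# The 1-5 scale used in the example data is an assumption, not from the paper.
ScorePair = Tuple[float, float]


def binary_labels(pairs: List[ScorePair], threshold: float) -> Tuple[bool, bool]:
    """Average each characteristic over all chemists and threshold the mean.

    Mean-then-threshold is a hypothetical aggregation rule chosen for clarity.
    """
    drug_mean = mean(p[0] for p in pairs)
    synth_mean = mean(p[1] for p in pairs)
    return drug_mean >= threshold, synth_mean >= threshold


def acquisition_filter(
    candidates: List[str],
    is_drug_like: Callable[[str], bool],
    is_easy_to_make: Callable[[str], bool],
) -> List[str]:
    """Keep only compounds that pass both binary classifiers."""
    return [c for c in candidates if is_drug_like(c) and is_easy_to_make(c)]


# Made-up scores from five chemists for three placeholder compounds.
scores: Dict[str, List[ScorePair]] = {
    "cmpd_A": [(4, 5), (5, 4), (4, 4), (3, 5), (4, 3)],
    "cmpd_B": [(2, 4), (1, 5), (2, 3), (2, 4), (3, 4)],
    "cmpd_C": [(4, 1), (5, 2), (4, 2), (4, 1), (5, 2)],
}

labels = {name: binary_labels(pairs, threshold=3.0) for name, pairs in scores.items()}

# Label lookups stand in for trained models here; a real filter would call
# a classifier on molecular descriptors of unseen compounds instead.
kept = acquisition_filter(
    list(scores),
    is_drug_like=lambda c: labels[c][0],
    is_easy_to_make=lambda c: labels[c][1],
)
print(kept)
```

In this toy data only `cmpd_A` passes both criteria; `cmpd_B` fails on drug-likeness and `cmpd_C` on ease of synthesis, which mirrors the "not drug-like and/or hard to synthesize" rejection rule in the abstract.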