King R D, Muggleton S, Lewis R A, Sternberg M J
Department of Statistics, Strathclyde University, Glasgow, United Kingdom.
Proc Natl Acad Sci U S A. 1992 Dec 1;89(23):11322-6. doi: 10.1073/pnas.89.23.11322.
The machine learning program GOLEM, from the field of inductive logic programming, was applied to the drug design problem of modeling structure-activity relationships. The training data for the program were 44 trimethoprim analogues and their observed inhibition of Escherichia coli dihydrofolate reductase. A further 11 compounds were used as unseen test data. GOLEM obtained rules that were statistically more accurate on the training data, and also better on the test data, than a Hansch linear regression model. Importantly, machine learning yielded understandable rules that characterized the chemistry of favored inhibitors in terms of polarity, flexibility, and hydrogen-bonding character. These rules agree with the stereochemistry of the interaction observed crystallographically.
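The abstract does not reproduce any of the learned rules, so the following Python sketch is only an illustration of what an understandable rule over polarity, flexibility, and hydrogen-bonding descriptors might look like, and of how such a rule could be scored. Everything in it is an assumption made for illustration: the `Substituent` descriptor scale, the rule body in `rule_prefers`, the pairwise "more active than" framing, and the three toy compounds with made-up activities. None of it is data or output from GOLEM.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Substituent:
    """Hypothetical qualitative descriptors on a small integer scale."""
    polarity: int
    flexibility: int
    h_donor: int
    h_acceptor: int


def rule_prefers(a: Substituent, b: Substituent) -> bool:
    """Illustrative rule (not a rule from the paper): predict that `a` is
    more active than `b` when `a` is more polar, no more flexible, and
    able to donate a hydrogen bond."""
    return (
        a.polarity > b.polarity
        and a.flexibility <= b.flexibility
        and a.h_donor >= 1
    )


def pairwise_accuracy(compounds, rule) -> float:
    """Fraction of ordered pairs with different activities for which the
    rule's prediction matches the observed activity ordering."""
    correct = 0
    total = 0
    for (desc_a, act_a), (desc_b, act_b) in permutations(compounds, 2):
        if act_a == act_b:
            continue  # ties carry no ordering information
        total += 1
        if rule(desc_a, desc_b) == (act_a > act_b):
            correct += 1
    return correct / total if total else 0.0


# Toy, made-up compounds: (descriptors, activity). Not experimental values.
TOY_COMPOUNDS = [
    (Substituent(polarity=3, flexibility=1, h_donor=1, h_acceptor=1), 7.0),
    (Substituent(polarity=1, flexibility=2, h_donor=0, h_acceptor=1), 6.0),
    (Substituent(polarity=0, flexibility=3, h_donor=0, h_acceptor=0), 6.5),
]

if __name__ == "__main__":
    print(pairwise_accuracy(TOY_COMPOUNDS, rule_prefers))
```

The rule is readable as a plain chemical statement, which is the property the abstract emphasizes. A regression model would instead return fitted coefficients, and the same held-out compounds could be used to compare the two approaches.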