Fuchs Jens-Alexander, Grisoni Francesca, Kossenjans Michael, Hiss Jan A, Schneider Gisbert
Swiss Federal Institute of Technology (ETH), Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093 Zürich, Switzerland.
University of Milano-Bicocca, Department of Earth and Environmental Sciences, p.za della Scienza 1, 20126 Milano, Italy.
Medchemcomm. 2018 Aug 22;9(9):1538-1546. doi: 10.1039/c8md00370j. eCollection 2018 Sep 1.
Lipophilicity prediction is routinely applied to small molecules and presents a working alternative to experimental log P or log D determination. For compounds outside the domain of classical medicinal chemistry these predictions lack accuracy, advocating the development of bespoke approaches. Peptides and their derivatives and mimetics fill the structural gap between small synthetic drugs and genetically engineered macromolecules. Here, we present a data-driven machine learning method for peptide log D prediction. A model for estimating the lipophilicity of short linear peptides consisting of natural amino acids was developed. In a prospective test, we obtained accurate predictions for a set of newly synthesized linear tri- to hexapeptides. Further model development focused on more complex peptide mimetics from the AstraZeneca compound collection. The results obtained demonstrate the applicability of the new prediction model to peptides and peptide derivatives in a log D range of approximately -3 to 5, with superior accuracy to established lipophilicity models for small molecules.