Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-7545, United States.
Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States.
ACS Chem Neurosci. 2021 Jun 16;12(12):2247-2253. doi: 10.1021/acschemneuro.1c00265. Epub 2021 May 24.
The ability to calculate whether small molecules will cross the blood-brain barrier (BBB) is important for companies working in neuroscience drug discovery. For a decade, scientists have relied on relatively simplistic rules such as Pfizer's central nervous system multiparameter optimization models (CNS-MPO) for guidance during the drug selection process. In parallel, more sophisticated machine learning models that use different molecular descriptors and algorithms have continued to be developed; however, these models represent a "black box" and are generally less interpretable. In both cases, the methods predict the ability of small molecules to cross the BBB from molecular structure information alone, without in vitro or in vivo data. We describe here the implementation of two versions of Pfizer's algorithm (Pf-MPO.v1 and Pf-MPO.v2) and compare them with a Bayesian machine learning model of BBB penetration trained on a data set of 2296 active and inactive compounds using extended connectivity fingerprint descriptors. The predictive ability of these approaches was compared using 40 known CNS-active drugs initially used by Pfizer as the positive set for validation of the Pf-MPO.v1 score. The Bayesian model predicted 37/40 (92.5%) compounds as active, while only 30/40 (75%) received a desirable Pf-MPO.v1 score ≥4 and 33/40 (82.5%) received a desirable Pf-MPO.v2 score ≥4, suggesting that the Bayesian model is more accurate than the MPO algorithms on this set. This also indicates that machine learning models are more flexible and have better predictive power for BBB penetration than simple rule sets that require multiple, accurate descriptor calculations. Our machine learning model statistics are comparable to those of recently published studies. We describe the implications of these findings and how machine learning may have a role alongside more interpretable methods.
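To illustrate what a rule-based MPO score with a "desirable ≥4" cutoff looks like in practice, here is a minimal Python sketch of an MPO-style desirability sum. It is not the implementation described in the abstract. The six descriptors and their thresholds are assumptions based on the commonly cited description of the original CNS MPO and should be checked against the primary source before use. The function names (`ramp_down`, `hump`, `mpo_score`, `is_desirable`) are hypothetical, and the descriptor values are assumed to be computed elsewhere.

```python
# Illustrative MPO-style score: each descriptor maps to a desirability in
# [0, 1] and the six values are summed (range 0 to 6).
# ASSUMPTION: thresholds follow the commonly cited CNS MPO description;
# they are not stated in the abstract and must be verified before use.

def ramp_down(x, good, bad):
    """Return 1.0 at or below `good`, 0.0 at or above `bad`, linear between."""
    if x <= good:
        return 1.0
    if x >= bad:
        return 0.0
    return (bad - x) / (bad - good)


def hump(x, lo_bad, lo_good, hi_good, hi_bad):
    """Return 1.0 inside [lo_good, hi_good], falling linearly to 0.0 outside."""
    if x <= lo_bad or x >= hi_bad:
        return 0.0
    if x < lo_good:
        return (x - lo_bad) / (lo_good - lo_bad)
    if x <= hi_good:
        return 1.0
    return (hi_bad - x) / (hi_bad - hi_good)


def mpo_score(clogp, clogd, mw, tpsa, hbd, pka):
    """Sum six desirability terms; descriptor values are supplied by the caller."""
    return (
        ramp_down(clogp, 3.0, 5.0)          # lipophilicity (ClogP)
        + ramp_down(clogd, 2.0, 4.0)        # distribution coefficient (ClogD)
        + ramp_down(mw, 360.0, 500.0)       # molecular weight
        + hump(tpsa, 20.0, 40.0, 90.0, 120.0)  # topological polar surface area
        + ramp_down(hbd, 0.5, 3.5)          # hydrogen-bond donors
        + ramp_down(pka, 8.0, 10.0)         # most basic pKa
    )


def is_desirable(score, cutoff=4.0):
    """Apply the 'desirable score >= 4' cutoff quoted in the abstract."""
    return score >= cutoff
```

A compound with every descriptor in its favorable range scores 6.0, and the score drops linearly as descriptors move toward the unfavorable bounds. Because each term depends on a separately calculated descriptor, any error in those calculations carries straight into the score, which is the dependence on "multiple, accurate descriptor calculations" that the abstract points to.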