Raveendrakumar E, Gopichand B, Bhosale H, Melethadathil N, Valadi J
Amrita School of Biotechnology, Amrita Vishwa Vidyapeetham, Amritapuri, India.
School of Computing and Data Sciences, FLAME University, Pune, India.
SAR QSAR Environ Res. 2024 Dec;35(12):1155-1171. doi: 10.1080/1062936X.2024.2446352. Epub 2025 Jan 8.
This study illustrates the use of chemical fingerprints with machine learning for blood-brain barrier (BBB) permeability prediction. Using the Blood-Brain Barrier Database (B3DB) dataset, we extracted nine different fingerprints. Support Vector Machine (SVM) and Extreme Gradient Boosting (XGBoost) algorithms were used to develop the permeability prediction models, and the Random Forest recursive Feature Selection (RF-RFS) method was used to extract informative attributes. An additional database was employed for the validation phase. The results indicate that models built on all nine fingerprint datasets performed well in the training, test and validation stages. We then selected the MACCS keys fingerprint, which gave one of the best-performing models, for explainability analysis. For this purpose, we applied SHapley Additive exPlanations (SHAP) analysis to this dataset to identify the key structural features influencing BBB permeability prediction. These features include aliphatic carbons, methyl groups and oxygen-containing groups. This study highlights the effectiveness of different fingerprint descriptors in predicting BBB permeability, and the SHAP analysis adds interpretive value to the models. These models should be of significant help in drug discovery processes, particularly in developing Central Nervous System (CNS) therapeutics.
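The attribution idea behind SHAP can be illustrated with a minimal sketch that computes exact Shapley values for three binary fingerprint bits. Everything in it is an assumption made for illustration: the bit names are informal labels rather than actual MACCS key definitions, `toy_model` and its coefficients are invented and do not come from the study, and the brute-force enumeration stands in for the approximations used by SHAP tooling.

```python
from itertools import combinations
from math import factorial

# Hypothetical fingerprint bits (illustrative labels, not real MACCS key definitions).
FEATURES = ["aliphatic_carbon", "methyl_group", "oxygen_group"]


def toy_model(bits):
    """Hypothetical permeability score; coefficients are arbitrary, not fitted to any data."""
    score = 0.2
    score += 0.3 * bits["aliphatic_carbon"]
    score += 0.2 * bits["methyl_group"]
    score -= 0.4 * bits["oxygen_group"]
    # One interaction term, so the attributions are not simply the coefficients.
    score += 0.1 * bits["aliphatic_carbon"] * bits["methyl_group"]
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every subset of the other features.

    Features outside the subset are set to their baseline value.
    """
    n = len(FEATURES)
    values = {}
    for feature in FEATURES:
        others = [g for g in FEATURES if g != feature]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for a coalition of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = {
                    g: instance[g] if (g in subset or g == feature) else baseline[g]
                    for g in FEATURES
                }
                without_f = {
                    g: instance[g] if g in subset else baseline[g]
                    for g in FEATURES
                }
                total += weight * (model(with_f) - model(without_f))
        values[feature] = total
    return values


# A hypothetical molecule with all three bits set, compared against an all-zero baseline.
instance = {name: 1 for name in FEATURES}
baseline = {name: 0 for name in FEATURES}
phi = shapley_values(toy_model, instance, baseline)

for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the model output for the molecule and for the baseline, which is the additivity property that gives SHAP its name. Exhaustive enumeration is only feasible for a handful of bits; a fingerprint with well over a hundred bits would need an approximate or model-specific explainer.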