Mahdi Wael A, Alhowyan Adel, Obaidullah Ahmad J
Department of Pharmaceutics, College of Pharmacy, King Saud University, P.O. Box 2457, 11451, Riyadh, Saudi Arabia.
Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, P.O. Box 2457, 11451, Riyadh, Saudi Arabia.
Sci Rep. 2025 Jan 6;15(1):1017. doi: 10.1038/s41598-024-84450-9.
This study uses machine learning (ML) models to predict the biodistribution of nanoparticles across organs and tissues, based on a dataset derived from research on nanoparticle behavior in cancer treatment. The dataset includes both categorical and numerical variables describing nanoparticle properties, and the outputs are the distributions in tumor, heart, liver, spleen, lung, and kidney tissues. To address the complex, non-linear nature of the data, three models were used: Bayesian Ridge Regression (BRR), Kernel Ridge Regression (KRR), and K-Nearest Neighbors (KNN). These models were selected for their range of capabilities in handling non-linear relationships and data complexity. To further improve model performance and robustness, the study applied the Firefly Algorithm for hyperparameter tuning and Recursive Feature Elimination (RFE) for feature selection. Based on higher R² and lower RMSE values for most output parameters, the study concluded that KRR outperformed the other models in predicting biodistribution outcomes. The results indicate that ML models, particularly KRR, can accurately capture the non-linear characteristics of nanoparticle biodistribution, and they provide insight into optimizing predictive models of nanoparticle behavior. Such models can be further enhanced through advanced feature selection and hyperparameter tuning techniques.
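The abstract names the techniques but gives no implementation details, so the following is only a minimal sketch of how a Firefly Algorithm might tune the two hyperparameters of an RBF kernel ridge regressor against a validation RMSE. Everything here is an assumption made for illustration: the synthetic one-dimensional data (not the study's dataset), the search bounds, the function names `solve`, `krr_rmse` and `firefly_minimize`, and the firefly settings (`beta0`, `absorption`, `step`). It is not the authors' code, and it omits RFE and the other two models.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def krr_rmse(params, train, valid):
    """Fit RBF kernel ridge regression on train; return RMSE on valid.

    params holds (log10 alpha, log10 gamma): ridge penalty and kernel width.
    """
    alpha, gamma = 10 ** params[0], 10 ** params[1]
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    n = len(xs)
    K = [[math.exp(-gamma * (xs[i] - xs[j]) ** 2) + (alpha if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    w = solve(K, ys)  # dual coefficients: (K + alpha I) w = y
    sq_err = 0.0
    for x, y in valid:
        pred = sum(w[i] * math.exp(-gamma * (x - xs[i]) ** 2) for i in range(n))
        sq_err += (pred - y) ** 2
    return math.sqrt(sq_err / len(valid))


def firefly_minimize(objective, bounds, n_fireflies=6, n_iter=10,
                     beta0=1.0, absorption=1.0, step=0.2, seed=0):
    """Basic firefly search: each firefly moves toward every brighter one."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_fireflies)]
    cost = [objective(p) for p in pos]
    best = min(range(n_fireflies), key=lambda k: cost[k])
    best_pos, best_cost = pos[best][:], cost[best]
    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # j is brighter (lower error)
                    dist2 = sum((a - b) ** 2 for a, b in zip(pos[i], pos[j]))
                    beta = beta0 * math.exp(-absorption * dist2)  # attractiveness
                    pos[i] = [
                        min(hi, max(lo, a + beta * (b - a) + step * (rng.random() - 0.5)))
                        for a, b, (lo, hi) in zip(pos[i], pos[j], bounds)
                    ]
                    cost[i] = objective(pos[i])
                    if cost[i] < best_cost:
                        best_pos, best_cost = pos[i][:], cost[i]
        step *= 0.9  # shrink the random walk over time
    return best_pos, best_cost


# Synthetic stand-in data (assumption): noisy sine curve, 20 train / 10 validation.
data_rng = random.Random(42)
points = []
for _ in range(30):
    x = data_rng.uniform(0.0, 6.0)
    points.append((x, math.sin(x) + data_rng.gauss(0.0, 0.1)))
train, valid = points[:20], points[20:]

# Assumed search ranges for log10(alpha) and log10(gamma).
BOUNDS = [(-4.0, 1.0), (-2.0, 2.0)]


def objective(params):
    return krr_rmse(params, train, valid)


best_params, best_rmse = firefly_minimize(objective, BOUNDS)
print("log10(alpha), log10(gamma):", best_params)
print("validation RMSE:", best_rmse)
```

Searching in log space is a design choice of this sketch: ridge penalties and kernel widths typically vary over orders of magnitude, so a uniform random walk on the exponents covers the range more evenly. In practice a library implementation of kernel ridge regression and cross-validated scoring would replace the hand-written solver.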