Almansour Khaled, Alqahtani Arwa Sultan
Department of Pharmaceutics, College of Pharmacy, University of Hail, Hail, Saudi Arabia.
Department of Chemistry, College of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), P.O. Box 90950, 11623, Riyadh, Saudi Arabia.
Sci Rep. 2025 Mar 14;15(1):8840. doi: 10.1038/s41598-025-92725-y.
This study investigates the use of machine learning for the regression task of predicting the size of PLGA (poly(lactic-co-glycolic acid)) nanoparticles. Both categorical and numeric inputs were used to build models that predict the optimum conditions for preparing nanosized PLGA particles for drug delivery applications. The proposed methodology employs Leave-One-Out (LOO) encoding for categorical feature transformation, Local Outlier Factor (LOF) for outlier detection, and the Bat Optimization Algorithm (BA) for hyperparameter optimization. The comparative analysis covers K-Nearest Neighbors (KNN), the ensemble methods Bagging and Adaptive Boosting (AdaBoost), and the novel Small-Size Bat-Optimized KNN Regression (SBNNR) model, which uses generative adversarial networks and deep feature extraction to improve performance on sparse datasets. The results show that ADA-KNN outperforms the other models for Particle Size prediction, with a test R² of 0.94385, while SBNNR is the most accurate for Zeta Potential prediction, with a test R² of 0.97674. These findings indicate that combining preprocessing, optimization, and ensemble techniques yields robust regression models. The contributions of this work are the development of the SBNNR model, validation of BA's optimization capabilities, and a comprehensive evaluation of ensemble methods. The method provides a framework for applying machine learning in materials science, particularly in nanoparticle characterization.
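As a rough illustration of two of the steps named in the abstract, the following minimal sketch applies leave-one-out target encoding to a categorical input and then runs a plain KNN regression on the encoded features. It is not the authors' implementation: the function names `loo_encode` and `knn_predict`, the feature names, and all data values are hypothetical, and the outlier detection, bat optimization, and ensemble steps are left out.

```python
from collections import defaultdict
import math


def loo_encode(categories, targets):
    """Leave-one-out target encoding: each row receives the mean target of
    the OTHER rows in its category (global mean if the category is unique)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for c, y in zip(categories, targets):
        sums[c] += y
        counts[c] += 1
    global_mean = sum(targets) / len(targets)
    encoded = []
    for c, y in zip(categories, targets):
        if counts[c] > 1:
            # Exclude the row's own target to avoid leaking it into its feature.
            encoded.append((sums[c] - y) / (counts[c] - 1))
        else:
            encoded.append(global_mean)
    return encoded


def knn_predict(train_X, train_y, query, k=3):
    """Plain KNN regression: average the targets of the k nearest rows."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Made-up toy data (NOT from the study): a categorical input, a numeric
# input, and a particle size in nm.
stabilizer = ["A", "A", "B", "B", "B", "C"]
conc = [1.0, 2.0, 1.0, 2.0, 3.0, 2.0]
size = [150.0, 170.0, 200.0, 220.0, 240.0, 180.0]

encoded = loo_encode(stabilizer, size)
train_X = list(zip(encoded, conc))

# An unseen row of category "B" is encoded with the full category mean.
# Features are left unscaled here for brevity; real use would scale them.
b_mean = sum(s for c, s in zip(stabilizer, size) if c == "B") / stabilizer.count("B")
pred = knn_predict(train_X, size, (b_mean, 2.0), k=3)

print(encoded)
print(pred)
```

Excluding each row's own target is what separates leave-one-out encoding from ordinary mean encoding, and it matters most on small datasets such as the sparse data the abstract describes, where a single row can dominate its category mean.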