Elkhoudary Mahmoud M, Marie Aya A, Hammad Sherin F, Salim Mohamed M, Kamal Amira H
Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Horus University-Egypt, New Damietta, 34517, Egypt.
Department of Pharmaceutical Analytical Chemistry, Faculty of Pharmacy, Tanta University, Tanta, 31527, Egypt.
BMC Chem. 2024 Nov 29;18(1):237. doi: 10.1186/s13065-024-01351-8.
This study compares the performance of four multivariate procedures, partial least squares (PLS), artificial neural networks (ANN), support vector regression (SVR), and the extreme gradient boosting (XG Boost) algorithm, for the determination of the anti-diabetic mixture of pioglitazone (PIO), alogliptin (ALG), and glimepiride (GLM) in pharmaceutical formulations by UV spectrometry. Key wavelengths were selected by knowledge-based variable selection, and several preprocessing methods (e.g., mean centering, orthogonal scatter correction, and principal component analysis) were applied to minimize noise and improve model precision. By focusing on specific spectral features rather than the entire spectrum, XG Boost improved both computing speed and accuracy, which demonstrates its advantage in resolving complex, overlapping spectral data. On the independent test set, XG Boost outperformed the other methods: it achieved the lowest root mean squared error of prediction (RMSEP) and standard deviation (SD) values for all three compounds, indicating minimal prediction error and variability. For PIO, XG Boost recorded an RMSEP of 0.100 and an SD of 0.369, markedly better than PLS and ANN. For ALG, XG Boost showed near-perfect performance with an RMSEP of 0.001 and an SD of 0.005, outperforming SVR and PLS, which had higher errors. For GLM, XG Boost also excelled with an RMSEP of 0.001 and an SD of 0.018, compared with the much higher errors of PLS and ANN. These results highlight the ability of XG Boost to handle complex, overlapping spectral data, making it the most reliable and accurate model in this study.
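The abstract ranks the models by RMSEP on an independent test set. As a minimal sketch of how that metric is usually computed, assuming the standard definition (square root of the mean squared difference between predicted and reference concentrations), the function below may help; the name `rmsep` and the sample values are illustrative assumptions and are not taken from the study.

```python
import math


def rmsep(actual, predicted):
    """Root mean squared error of prediction over a test set.

    Assumes the standard definition: sqrt(mean((predicted - actual)^2)).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical reference and predicted concentrations (arbitrary units),
# invented for illustration only; they are not data from the paper.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [2.1, 3.9, 6.2, 7.8]

print(round(rmsep(actual, predicted), 4))
```

A lower value means the predictions lie closer to the reference concentrations, which is the sense in which the abstract compares the four models.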