Alkhanani Mustfa Faisal
Biology Department, College of Science, University of Hafr Al Batin, Hafr Al Batin, Saudi Arabia.
PLoS One. 2025 May 30;20(5):e0324827. doi: 10.1371/journal.pone.0324827. eCollection 2025.
Accurately predicting the refractive index of hemoglobin across various wavelengths and concentrations is critical for advancing optical diagnostic techniques in biological and clinical applications. This study introduces a predictive model based on Gaussian Process Regression (GPR) to estimate the refractive index of hemoglobin in both oxygenated and deoxygenated states, covering wavelengths from 400 to 700 nm and concentrations ranging from 0 to 140 g/L. The GPR model effectively captures non-linear relationships, achieving high prediction accuracy with R² values of 99.4% for the training dataset and 99.3% for the testing dataset. An independent external dataset was used to validate the model's robustness further, yielding an R² value of 92.80%, RMSE of 0.0042, and MSE of 1.77 × 10⁻⁵, demonstrating the model's strong generalizability. To enhance interpretability, Partial Dependence Plots (PDPs) were employed to visualize the influence of wavelength and concentration on refractive index predictions, offering clear insights into hemoglobin's optical behavior. The model's ability to provide accurate and interpretable predictions has significant implications for improving the reliability of biophotonic diagnostic tools, such as optical coherence tomography and reflectance spectroscopy, in clinical settings. By combining machine learning with interpretability techniques, this study advances the understanding of hemoglobin's optical properties and sets a benchmark for predictive modeling in biomedical optics, paving the way for more precise and dependable diagnostic applications.
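As a rough illustration of the workflow summarized above, the following sketch fits a Gaussian process with a squared-exponential kernel to wavelength and concentration inputs and then reports R², MSE, and RMSE on held-out points. It is not the author's implementation: the abstract does not state the kernel, the hyperparameters, or the data, so the kernel choice, the fixed `length_scale` and `noise` values, the helper names, and the synthetic refractive-index function are all assumptions made for illustration. The synthetic values are not physical measurements, and the oxygenation state is left out for brevity.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=0.5):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def gpr_predict(X_train, y_train, X_test, noise=1e-6):
    """GP posterior mean and standard deviation at X_test.

    Hyperparameters are fixed here (an assumption); a full treatment would
    tune them, for example by maximizing the marginal likelihood.
    """
    mu, sigma = y_train.mean(), y_train.std()
    z = (y_train - mu) / sigma                      # standardize the targets
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    L = np.linalg.cholesky(K)                       # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, z))
    K_s = rbf_kernel(X_train, X_test)               # shape (n_train, n_test)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.maximum(1.0 - np.sum(v**2, axis=0), 0.0)  # prior variance is 1
    return mu + sigma * mean, sigma * np.sqrt(var)


# Synthetic placeholder data over the ranges quoted in the abstract.
rng = np.random.default_rng(0)
wavelength_nm = rng.uniform(400.0, 700.0, 200)
concentration_g_per_l = rng.uniform(0.0, 140.0, 200)

# Scale both inputs to [0, 1] so a single length scale is reasonable.
X = np.column_stack([(wavelength_nm - 400.0) / 300.0,
                     concentration_g_per_l / 140.0])

# Hypothetical smooth response, NOT a physical model of hemoglobin.
y = 1.333 + 0.027 * X[:, 1] + 0.004 * np.sin(2.0 * np.pi * X[:, 0])

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

pred, pred_std = gpr_predict(X_train, y_train, X_test)

# Regression metrics of the kind reported in the abstract.
residual = y_test - pred
mse = float(np.mean(residual**2))
rmse = float(np.sqrt(mse))
r2 = 1.0 - float(np.sum(residual**2)) / float(np.sum((y_test - y_test.mean())**2))

print(f"R2 = {r2:.4f}, MSE = {mse:.3e}, RMSE = {rmse:.3e}")
```

Because RMSE is the square root of MSE, the two external-validation figures quoted above can be checked against each other: 0.0042² ≈ 1.76 × 10⁻⁵, which agrees with the reported MSE of 1.77 × 10⁻⁵ to within rounding of the RMSE.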