Chiang Pei-Jhang, Tsao Chih-Wei, Jhuo Yu-Cing, Chu Ta-Wei, Pei Dee, Kuo Shi-Wen
Division of Urology, Department of Surgery, Tri-Service General Hospital, National Defense Medical University, Taipei 114202, Taiwan.
In-Service Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei 110301, Taiwan.
Biomedicines. 2025 Jul 24;13(8):1816. doi: 10.3390/biomedicines13081816.
Background/Objectives: Homocysteine (Hcy) is a sulfur-containing amino acid crucial for various physiological processes, and elevated levels are linked to adverse cardiovascular and neurological conditions. Many factors contribute to high Hcy, and past studies of these factors relied on traditional statistical methods. Machine learning (ML) techniques have recently improved greatly and are now widely applied in medical research. This study used four ML methods to identify key factors influencing Hcy in healthy elderly Taiwanese men and compared their accuracy with that of multiple linear regression (MLR). The aim was to improve Hcy prediction accuracy and provide insight into the relevant influencing factors.
Methods: A total of 468 healthy elderly men were studied using 33 parameters and four ML methods: random forest (RF), stochastic gradient boosting (SGB), eXtreme gradient boosting (XGBoost), and elastic net (EN). MLR served as the benchmark. Model performance was assessed using symmetric mean absolute percentage error (SMAPE), relative absolute error (RAE), root relative squared error (RRSE), and root mean squared error (RMSE).
Results: All ML methods produced lower prediction errors than MLR, indicating higher accuracy. When the importance scores from the four ML models were averaged, C-reactive protein (CRP) emerged as the leading influencing factor for Hcy, followed by glutamic pyruvic transaminase (GPT), white blood cell count (WBC), lactate dehydrogenase (LDH), estimated glomerular filtration rate (eGFR), and sport volume (SV).
Conclusions: ML methods outperformed MLR in predicting Hcy levels in healthy elderly Taiwanese men. CRP was identified as the most important factor, followed by GPT/ALT, WBC, LDH, and eGFR.
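The four error metrics named in the Methods (SMAPE, RAE, RRSE, RMSE) can be sketched as below. The abstract does not give the formulas the authors used, so these are common textbook definitions and should be read as an assumption; in particular, SMAPE has several variants. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent.

    Assumed variant: mean of |y - y_hat| / ((|y| + |y_hat|) / 2) * 100.
    A term is counted as 0 when both values are 0 to avoid dividing by zero.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2.0
        if denom > 0:
            total += abs(t - p) / denom
    return 100.0 * total / len(y_true)


def rae(y_true, y_pred):
    """Relative absolute error: absolute error relative to a predictor
    that always outputs the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    num = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    den = sum(abs(t - mean_y) for t in y_true)
    return num / den


def rrse(y_true, y_pred):
    """Root relative squared error: squared error relative to the
    mean predictor, then square-rooted."""
    mean_y = sum(y_true) / len(y_true)
    num = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    den = sum((t - mean_y) ** 2 for t in y_true)
    return math.sqrt(num / den)


if __name__ == "__main__":
    # Hypothetical values for illustration only (not study data).
    y_true = [10.0, 12.0, 14.0]
    y_pred = [11.0, 12.0, 13.0]
    for name, fn in [("SMAPE", smape), ("RAE", rae), ("RRSE", rrse), ("RMSE", rmse)]:
        print(name, round(fn(y_true, y_pred), 4))
```

Lower is better for all four metrics. RAE and RRSE are scaled against a predictor that always returns the mean, so values below 1 indicate a model that beats that trivial baseline.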