Gan Minzhi, Peng Yong, Ying Ying, Zhang Keyue, Chen Yong
Department of Rheumatology and Immunology, Ningbo NO.2 Hospital, Ningbo, Zhejiang, China.
Front Immunol. 2025 Jul 8;16:1614631. doi: 10.3389/fimmu.2025.1614631. eCollection 2025.
This study aimed to evaluate the utility of machine learning algorithms in differentiating rheumatoid arthritis-Sjögren's syndrome overlap (RA-SS) from Sjögren's syndrome with polyarthritis (SS-PA), and to identify key factors influencing diagnostic differentiation.
This retrospective analysis included 106 RA-SS and 135 SS-PA patients, randomly split 7:3 into training and validation sets. Clinical, laboratory, and radiographic data were collected. Least Absolute Shrinkage and Selection Operator (LASSO) regression was used for feature selection, after which diagnostic models were built with four machine learning algorithms. Feature importance was quantified with SHapley Additive exPlanations (SHAP).
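The workflow described above (7:3 split, LASSO-based feature selection, then classifier fitting) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' code. The feature matrix, the stratified split, the L1-penalized logistic regression used as the LASSO step, and all hyperparameters are assumptions. The abstract does not name all four algorithms, so only a random forest and a logistic regression baseline are shown, and the SHAP step is omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort: 106 RA-SS (label 1) and 135 SS-PA (label 0).
# The 20 features and the signal placed in the first 5 are purely hypothetical.
rng = np.random.default_rng(0)
y = np.array([1] * 106 + [0] * 135)
X = rng.normal(size=(y.size, 20))
X[y == 1, :5] += 1.0

# 7:3 split into training and validation sets (stratification is an assumption).
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

def lasso_selector():
    # L1-penalized logistic regression keeps only features with non-zero weights.
    return SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
    )

models = {
    "random_forest": make_pipeline(
        StandardScaler(),
        lasso_selector(),
        RandomForestClassifier(n_estimators=200, random_state=0),
    ),
    "logistic_regression": make_pipeline(
        StandardScaler(),
        lasso_selector(),
        LogisticRegression(max_iter=1000),
    ),
}

# Fit each model on the training set and score it on the held-out validation set.
aucs = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    aucs[name] = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
    print(name, round(aucs[name], 3))
```

Placing the selector inside each pipeline means feature selection is fitted on the training data only, which avoids leaking validation information into the selected feature set.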
The random forest algorithm demonstrated superior performance (AUC=0.854, 95% CI: 0.747-0.944) compared to other machine learning algorithms. SHAP analysis identified anti-CCP level, rheumatoid factor (RF) level, erosive joint count, anti-SSA/Ro60 antibodies, and C-reactive protein (CRP) as critical discriminating factors between RA-SS and SS-PA.
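To make the reported interval concrete, the sketch below shows one common way to attach a 95% CI to a validation-set AUC, namely a percentile bootstrap. The abstract does not state how its CI was computed, so the method here is an assumption, and the labels, scores, and validation-set class sizes are synthetic placeholders rather than study data.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count pairwise wins; ties contribute one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples containing a single class; AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical validation set: 32 positives and 41 negatives with noisy scores.
data_rng = random.Random(1)
labels = [1] * 32 + [0] * 41
scores = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

point = auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.3f}, 95% CI: {ci_low:.3f}-{ci_high:.3f}")
```

With a validation set of only a few dozen patients per class, an interval of this kind tends to be wide, which is consistent with the roughly 0.2-wide interval reported above.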
The random forest algorithm shows promising clinical potential for differentiating RA-SS from SS-PA. Its diagnostic performance surpassed that of traditional logistic regression (LR), offering a new approach to clinical differentiation.