Ma Jingtai, Fang Yiting, Li Shiqi, Zeng Lilian, Chen Siyi, Li Zhifeng, Ji Guiyuan, Yang Xingfen, Wu Wei
National Medical Products Administration (NMPA) Key Laboratory for Safety Evaluation of Cosmetics, Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou, China.
Guangdong Provincial Institute of Public Health, Guangdong Provincial Center for Disease Control and Prevention, Guangzhou, China.
Front Immunol. 2025 May 1;16:1528046. doi: 10.3389/fimmu.2025.1528046. eCollection 2025.
The "gut-skin axis" has been proposed to play an important role in the development and symptoms of atopic dermatitis. We therefore constructed an interpretable machine learning framework to quantitatively screen for key gut flora.
After a centered log-ratio (CLR) transformation, the 16S rRNA dataset was analyzed with five machine learning models: random forest, light gradient boosting machine (LightGBM), extreme gradient boosting (XGBoost), support vector machine with a radial kernel, and logistic regression. Interpretable machine learning methods, such as SHapley Additive exPlanations (SHAP) values, were used to identify significant features associated with atopic dermatitis.
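As an illustration of the centered log-ratio step named above, the following is a minimal sketch in plain Python. It is not the authors' pipeline: the function name `clr_transform`, the pseudocount of 0.5 used to handle zero counts, and the example counts are all assumptions made for this example. The abstract does not say how zeros were treated.

```python
import math

def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    Each value becomes log(count + pseudocount) minus the mean of those
    logs across all taxa in the sample, which is the log of the count
    divided by the sample's geometric mean.
    The pseudocount is an assumption to avoid log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical read counts for four taxa in a single sample.
sample = [120, 30, 0, 850]
clr = clr_transform(sample)
```

By construction the transformed values of a sample sum to zero, so each feature expresses a taxon's abundance relative to the sample's geometric mean rather than as an absolute count.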
Random forest outperformed the other tree-based models in the validation partitions. The SHAP global dependency plot indicated that [taxon name missing in source] ranked as the strongest predictive factor across all prediction horizons, although the SHAP values for some features were still higher in the support vector machine and logistic regression models. The SHAP partial dependency plot for the tree-based models showed that the best segmentation point for [taxon name missing in source] lay further from the origin than those of other features in the respective models, quantitatively reflecting differences in gut microbiota.
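To make the idea of a per-feature SHAP value concrete, here is a minimal sketch that computes exact Shapley values for a toy model by enumerating feature subsets. It is an illustration only: the toy `predict` function, the all-zero baseline, and the feature values are hypothetical, and absent features are replaced by a single baseline value. The SHAP library itself relies on faster model-specific or sampling-based estimators and a background dataset, so this is not how the study's values were produced.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) relative to predict(baseline).

    Features outside a coalition take their baseline value (an assumed,
    simplified way of marking a feature as absent).
    """
    n = len(x)

    def value(subset):
        mixed = [x[j] if j in subset else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

# Hypothetical model over three CLR-transformed taxa, with one interaction term.
def predict(f):
    return 0.8 * f[0] - 0.5 * f[1] + 0.3 * f[0] * f[2]

x = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(predict, x, baseline)

# Rank features by absolute contribution, largest first.
ranking = sorted(range(len(phi)), key=lambda i: abs(phi[i]), reverse=True)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split evenly between the two features involved. Averaging absolute attributions over many samples gives the kind of global feature ranking the abstract describes.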
Machine learning models combined with SHAP could be used to quantitatively screen key gut flora in atopic dermatitis patients, providing doctors with an intuitive understanding of 16S rRNA sequencing data to support precision medicine in care and recovery.