Chang Che-Cheng, Liu Tzu-Chi, Lu Chi-Jie, Chiu Hou-Chang, Lin Wei-Ning
PhD Program in Nutrition and Food Science, Fu Jen Catholic University, New Taipei City, Taiwan.
Department of Neurology, Fu Jen Catholic University Hospital, Fu Jen Catholic University, New Taipei City, Taiwan.
Front Microbiol. 2023 Sep 27;14:1227300. doi: 10.3389/fmicb.2023.1227300. eCollection 2023.
Myasthenia gravis (MG) is a neuromuscular junction disease with a complex pathophysiology and clinical variation for which no clear biomarker has been discovered. We hypothesized that, because changes in gut microbiome composition often occur in autoimmune diseases, the gut microbiome structures of patients with MG would differ from those of individuals without MG, and that a supervised machine learning (ML) analysis strategy could be trained on gut microbiota data for diagnostic screening of MG. Genomic DNA was collected from the stool samples of patients with MG and of individuals without MG, and a sequencing library was established by constructing amplicon sequence variants (ASVs) and completing taxonomic classification of each representative DNA sequence. Four ML methods, namely least absolute shrinkage and selection operator, extreme gradient boosting (XGBoost), random forest, and classification and regression trees, with nested leave-one-out cross-validation, were trained using ASV taxon-based data and full ASV-based data to identify key ASVs in each data set. The results revealed XGBoost to have the best predictive performance. Overlapping key features extracted when XGBoost was trained using the full ASV-based and ASV taxon-based data were identified, and 31 high-importance ASVs (HIASVs) were obtained, assigned importance scores, and ranked. The most significant difference observed was in the abundance of bacteria in the and families. The 31 HIASVs were used to train the XGBoost algorithm to differentiate individuals with and without MG. The model had high diagnostic classification power and could accurately predict and identify patients with MG. In addition, the abundance of was associated with limb weakness severity. In this study, we discovered that the composition of gut microbiomes differed between MG and non-MG subjects. In addition, the proposed XGBoost model trained using 31 HIASVs had the most favorable performance with respect to analyzing gut microbiomes.
These HIASVs selected by the ML model may serve as biomarkers for clinical use and mechanistic study in the future. Our proposed ML model can identify several taxonomic markers and discriminate patients with MG from those without with high accuracy. The ML strategy can be applied as a benchmark for noninvasive screening of MG.
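The nested leave-one-out cross-validation named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: a k-nearest-neighbour classifier stands in for XGBoost and the other three methods, and the abundance vectors, labels, and the names `knn_predict`, `loo_accuracy`, `nested_loo`, `toy_X`, and `toy_y` are hypothetical. The point is only the structure: the outer loop holds out one subject for evaluation, while the inner loop chooses the hyperparameter using the remaining subjects alone.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training samples closest to x (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, k):
    """Plain leave-one-out accuracy for a fixed k (used as the inner loop)."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)


def nested_loo(X, y, k_grid):
    """Outer loop: hold out one sample. Inner loop: tune k on the rest only."""
    predictions = []
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        # The held-out sample never influences the choice of k.
        best_k = max(k_grid, key=lambda k: loo_accuracy(rest_X, rest_y, k))
        predictions.append(knn_predict(rest_X, rest_y, X[i], best_k))
    return predictions


# Hypothetical relative abundances of two ASVs for six subjects (1 = MG, 0 = non-MG).
toy_X = [[0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
         [0.10, 0.90], [0.20, 0.80], [0.15, 0.85]]
toy_y = [1, 1, 1, 0, 0, 0]

predictions = nested_loo(toy_X, toy_y, k_grid=[1, 3])
accuracy = sum(p == t for p, t in zip(predictions, toy_y)) / len(toy_y)
print(predictions, accuracy)
```

Keeping hyperparameter selection inside the inner loop is what prevents the held-out subject from leaking into model tuning, which matters for small cohorts where leave-one-out is typically used.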