Nguyen Phong B H, Garger Daniel, Lu Diyuan, Maalmi Haifa, Prokisch Holger, Thorand Barbara, Adamski Jerzy, Kastenmüller Gabi, Waldenberger Melanie, Gieger Christian, Peters Annette, Suhre Karsten, Bönhof Gidon J, Rathmann Wolfgang, Roden Michael, Grallert Harald, Ziegler Dan, Herder Christian, Menden Michael P
Institute of Computational Biology, Helmholtz Munich, 85764, Neuherberg, Germany.
Faculty of Biology, Ludwig-Maximilians University Munich, 82152, Martinsried, Germany.
Commun Med (Lond). 2024 Dec 16;4(1):265. doi: 10.1038/s43856-024-00637-1.
Distal sensorimotor polyneuropathy (DSPN) is a common neurological disorder in elderly adults and in people with obesity, prediabetes and diabetes, and it is associated with high morbidity and premature mortality. DSPN is a multifactorial disease that is not yet fully understood.
Here, we developed the Interpretable Multimodal Machine Learning (IMML) framework for predicting DSPN prevalence and incidence from sparse multimodal data. Exploiting IMML's interpretability further enabled biomarker identification. We leveraged the population-based KORA F4/FF4 cohort, comprising 1091 participants with deep multimodal characterisation, i.e. clinical data, genomics, methylomics, transcriptomics, proteomics, inflammatory proteins and metabolomics.
Clinical data alone is sufficient to stratify individuals with and without DSPN (AUROC = 0.752), whilst predicting DSPN incidence 6.5 ± 0.2 years later benefits strongly from complementing clinical data with two or more molecular modalities (ΔAUROC > 0.1, reaching an AUROC of 0.714). Important and interpretable features for predicting incident DSPN include up-regulation of proinflammatory cytokines and down-regulation of the SUMOylation pathway and essential fatty acids, yielding novel insights into the disease pathophysiology.
These features may become biomarkers for incident DSPN, guide prevention strategies and serve as a proof of concept for the utility of IMML in studying complex diseases.
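For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how an AUROC and a ΔAUROC between two models can be computed. It is not the IMML implementation: the function names `auroc` and `delta_auroc`, and all labels and scores, are hypothetical toy values chosen only for illustration and are unrelated to the KORA data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the Mann-Whitney identity: the fraction of
    (positive, negative) pairs in which the positive case scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # A correctly ordered pair counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def delta_auroc(
    labels: Sequence[int],
    baseline_scores: Sequence[float],
    extended_scores: Sequence[float],
) -> float:
    """Gain in AUROC of an extended model over a baseline model."""
    return auroc(labels, extended_scores) - auroc(labels, baseline_scores)


# Hypothetical toy data (1 = develops DSPN, 0 = does not); not study results.
labels = [0, 0, 1, 1, 0, 1]
clinical_only = [0.2, 0.6, 0.5, 0.7, 0.4, 0.3]
clinical_plus_molecular = [0.1, 0.4, 0.6, 0.8, 0.3, 0.5]

print(f"baseline AUROC: {auroc(labels, clinical_only):.3f}")
print(f"extended AUROC: {auroc(labels, clinical_plus_molecular):.3f}")
print(f"delta AUROC:    {delta_auroc(labels, clinical_only, clinical_plus_molecular):.3f}")
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for a small illustration; for cohort-scale data a rank-based routine from an established library would be the more usual choice.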