Martorell-Marugán Jordi, Chierici Marco, Jurman Giuseppe, Alarcón-Riquelme Marta E, Carmona-Sáez Pedro
Department of Statistics and OR, University of Granada, Granada, 18071, Spain; Data Science for Health Research Unit, Fondazione Bruno Kessler, Trento, 38123, Italy; Bioinformatics Unit, GENYO. Centre for Genomics and Oncological Research: Pfizer / University of Granada / Andalusian Regional Government, PTS Granada, Granada, 18016, Spain.
Data Science for Health Research Unit, Fondazione Bruno Kessler, Trento, 38123, Italy.
Comput Biol Med. 2023 Jan;152:106373. doi: 10.1016/j.compbiomed.2022.106373. Epub 2022 Nov 28.
Systemic lupus erythematosus and primary Sjögren's syndrome are complex systemic autoimmune diseases that are often misdiagnosed. In this article, we demonstrate the potential of machine learning to perform differential diagnosis of these similar pathologies using gene expression and methylation data from 651 individuals. Furthermore, we analyzed the impact of the heterogeneity of these diseases on the performance of the predictive models and found that patients assigned to a specific molecular cluster are misclassified more often and affect the overall performance of the models. In addition, we found that samples characterized by high interferon activity are predicted most accurately, followed by samples with high inflammatory activity. Finally, we identified a group of biomarkers that improve the predictions compared with using the whole dataset, and we validated them with external studies from other tissues and technological platforms.