Kourou Konstantina D, Pezoulas Vasileios C, Georga Eleni I, Exarchos Themis, Papaloukas Costas, Voulgarelis Michalis, Goules Andreas, Nezos Andrianos, Tzioufas Athanasios G, Moutsopoulos Haralampos M, Mavragani Clio, Fotiadis Dimitrios I
1 Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, The University of Ioannina, GR45110 Ioannina, Greece.
2 Department of Biological Applications and Technology, The University of Ioannina, GR45110 Ioannina, Greece.
IEEE Open J Eng Med Biol. 2020 Feb 14;1:49-56. doi: 10.1109/OJEMB.2020.2965191. eCollection 2020.
Lymphoma development is one of the most serious clinico-pathological manifestations in patients with Sjögren's Syndrome (SS). Over the last decades, the risk of lymphomagenesis in SS patients has been studied with the aim of identifying novel biomarkers and risk factors that predict lymphoma development in this patient population. The current study explores whether the genetic susceptibility profiles of SS patients, combined with known clinical, serological and histological risk factors, improve the accuracy of predicting lymphoma development. The potential predictive role of genetic variants and of clinical and laboratory risk factors was investigated through a machine learning (ML)-based framework that encapsulates ensemble classifiers. Ensemble methods improve classification accuracy by combining approaches that are sensitive to minor perturbations in the training phase. Evaluated with a 10-fold stratified cross-validation procedure, the proposed methodology achieved the following balanced accuracy (mean ± standard deviation): Gradient Boosting (GB), 0.7780 ± 0.1514; Random Forest (RF) with the Gini criterion, 0.7626 ± 0.1787; RF with the entropy criterion, 0.7590 ± 0.1837. The initial clinical, serological, histological and genetic findings at early diagnosis were exploited in an attempt to establish predictive tools for clinical practice and to further the understanding of lymphoma development in SS.
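The evaluation protocol named in the abstract (stratified k-fold cross-validation scored by balanced accuracy and reported as mean ± standard deviation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the helper names `balanced_accuracy`, `stratified_folds` and `cross_validate` are hypothetical, the 80/20 label vector is toy data rather than the study cohort, and a trivial majority-class predictor stands in for the GB and RF ensembles.

```python
from collections import defaultdict
from statistics import mean, stdev


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so the minority class weighs as much as the majority."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return mean(hits[c] / totals[c] for c in totals)


def stratified_folds(labels, k):
    """Deal sample indices into k folds class by class (round-robin),
    so each fold keeps roughly the overall class ratio."""
    folds = [[] for _ in range(k)]
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    offset = 0
    for idxs in by_class.values():
        for j, i in enumerate(idxs):
            folds[(offset + j) % k].append(i)
        offset += len(idxs)
    return folds


def cross_validate(labels, k):
    """Score a placeholder majority-class predictor on each held-out fold."""
    scores = []
    for test_idx in stratified_folds(labels, k):
        held_out = set(test_idx)
        train = [y for i, y in enumerate(labels) if i not in held_out]
        majority = max(set(train), key=train.count)  # stand-in for a trained model
        y_true = [labels[i] for i in test_idx]
        y_pred = [majority] * len(y_true)
        scores.append(balanced_accuracy(y_true, y_pred))
    return scores


# Toy, imbalanced labels (hypothetical): 80 negatives, 20 positives.
labels = [0] * 80 + [1] * 20
scores = cross_validate(labels, k=10)
print(f"balanced accuracy: {mean(scores):.4f} ± {stdev(scores):.4f}")
# prints "balanced accuracy: 0.5000 ± 0.0000"
```

The placeholder predictor is correct on 80% of each held-out fold, yet its balanced accuracy is only 0.5. This is why balanced accuracy is a sensible metric when one outcome is much rarer than the other. In practice, scikit-learn provides `StratifiedKFold`, `balanced_accuracy_score`, `GradientBoostingClassifier` and `RandomForestClassifier` (with `criterion="gini"` or `criterion="entropy"`) for the same workflow.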