Machine learning-based prediction of rheumatoid arthritis with development of ACPA autoantibodies in the presence of non-HLA genes polymorphisms.

Author information

Electrical Engineering Faculty, Czestochowa University of Technology, Czestochowa, Poland.

Faculty of Mathematics and Computer Science, University of Lodz, Lodz, Poland.

Publication information

PLoS One. 2024 Mar 22;19(3):e0300717. doi: 10.1371/journal.pone.0300717. eCollection 2024.

Abstract

Machine learning (ML) algorithms can handle complex genomic data and identify predictive patterns that may not be apparent through traditional statistical methods. They have become popular tools for medical applications, including the prediction, diagnosis, and treatment of complex diseases such as rheumatoid arthritis (RA). RA is an autoimmune disease in which genetic factors play a major role. Among the most important genetic factors that predispose to the disease and serve as genetic markers are HLA-DRB and non-HLA gene single nucleotide polymorphisms (SNPs). Another marker of RA is the presence of anti-citrullinated peptide antibodies (ACPA), which correlates with the severity of RA. We use genetic data on SNPs in five non-HLA genes (PTPN22, STAT4, TRAF1, CD40, and PADI4) to predict the occurrence of ACPA-positive RA in the Polish population. This work is a comprehensive comparative analysis in which we assess and compare various ML classifiers: logistic regression, k-nearest neighbors, naïve Bayes, decision tree, boosted trees, multilayer perceptron, and support vector machines. The top-performing models achieved closely matched accuracy, each with its own particular strengths. Among these, we recommend the decision tree as the first choice, given its performance and interpretability. The sensitivity and specificity of the ML models are about 70%, which is satisfactory. In addition, we introduce a novel feature importance estimation method characterized by transparent interpretability and global optimality. This method allows us to explore all possible combinations of polymorphisms and to pinpoint those with the highest predictive power. Taken together, these findings suggest that non-HLA SNPs make it possible to identify the group of individuals more prone to developing RA and, in turn, to implement a more precise preventive approach.
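The abstract does not spell out how the feature importance method works, so the following is only a minimal sketch of the general idea of scoring every combination of polymorphisms by sensitivity and specificity. It is not the authors' method. The genotype values are invented toy data, and the majority-vote lookup classifier, the balanced-accuracy score, and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Gene names come from the abstract; one hypothetical SNP per gene.
GENES = ["PTPN22", "STAT4", "TRAF1", "CD40", "PADI4"]

# Invented toy data, NOT study data: genotype codes (0, 1, 2 minor alleles)
# and a label (1 = ACPA-positive RA, 0 = control).
SAMPLES = [
    ((1, 0, 0, 1, 0), 1), ((2, 1, 0, 0, 1), 1),
    ((1, 1, 1, 0, 0), 1), ((0, 2, 1, 1, 0), 1),
    ((0, 0, 0, 1, 0), 0), ((0, 1, 0, 0, 1), 0),
    ((0, 0, 1, 0, 0), 0), ((1, 0, 1, 1, 1), 0),
]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def lookup_predict(samples, subset):
    """Majority vote per genotype pattern over the chosen SNP indices.

    Fitted and evaluated on the same samples (in-sample only); ties are
    predicted as 0.
    """
    votes = defaultdict(Counter)
    for genotypes, label in samples:
        votes[tuple(genotypes[i] for i in subset)][label] += 1
    preds = []
    for genotypes, _ in samples:
        counts = votes[tuple(genotypes[i] for i in subset)]
        preds.append(1 if counts[1] > counts[0] else 0)
    return preds


def rank_subsets(samples, n_features):
    """Score every non-empty SNP subset; best balanced accuracy first."""
    y_true = [label for _, label in samples]
    results = []
    for k in range(1, n_features + 1):
        for subset in combinations(range(n_features), k):
            preds = lookup_predict(samples, subset)
            sens, spec = sensitivity_specificity(y_true, preds)
            results.append(((sens + spec) / 2, subset, sens, spec))
    # Sort by score (descending), then prefer smaller subsets on ties.
    results.sort(key=lambda r: (-r[0], len(r[1]), r[1]))
    return results


if __name__ == "__main__":
    for score, subset, sens, spec in rank_subsets(SAMPLES, len(GENES))[:3]:
        names = [GENES[i] for i in subset]
        print(names, f"score={score:.2f} sens={sens:.2f} spec={spec:.2f}")
```

With five SNPs there are only 31 non-empty subsets, so an exhaustive search of this kind is cheap. Because this sketch scores each subset on the same samples it was fitted to, the scores are optimistic; a real analysis would evaluate on held-out data, for example with cross-validation.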

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19c3/10959370/5258ee9fcd83/pone.0300717.g001.jpg
