Amyotrophic lateral sclerosis diagnosis using machine learning and multi-omic data integration.

Authors

Nikafshan Rad Hima, Su Zheng, Trinh Anne, Hakim Newton M A, Shamsani Jannah, Karim Abdul, Sattar Abdul

Affiliations

School of Information and Communication Technology, Griffith University, 170 Kessels Rd, Nathan, Brisbane, 4111, QLD, Australia.

GenieUs Genomics Pty Ltd, Sydney, 2000, NSW, Australia.

Publication information

Heliyon. 2024 Oct 1;10(20):e38583. doi: 10.1016/j.heliyon.2024.e38583. eCollection 2024 Oct 30.

Abstract

Amyotrophic Lateral Sclerosis (ALS) is a complex and rare neurodegenerative disorder characterized by significant genetic, molecular, and clinical heterogeneity. Despite numerous endeavors to discover the genetic factors underlying ALS, a significant number of these factors remain unknown. This knowledge gap highlights the necessity for personalized medicine approaches that can provide more comprehensive information for the purposes of diagnosis, prognosis, and treatment of ALS. This work utilizes an innovative approach by employing a machine learning-facilitated, multi-omic model to develop a more comprehensive knowledge of ALS. Through unsupervised clustering on gene expression profiles, 9,847 genes associated with ALS pathways are isolated and integrated with 7,699 genes containing rare, presumed pathogenic genomic variants, leading to a comprehensive amalgamation of 17,546 genes. Subsequently, a Variational Autoencoder is applied to distil complex biomedical information from these genes, culminating in the creation of the proposed Multi-Omics for ALS (MOALS) model, which has been designed to expose intricate genotype-phenotype interconnections within the dataset. Our meticulous investigation elucidates several pivotal ALS signaling pathways and demonstrates that MOALS is a superior model, outclassing other machine learning models based on single omic approaches such as SNV and RNA expression, enhancing accuracy by 1.7 percent and 6.2 percent, respectively. The findings of this study suggest that analyzing the relationships within biological systems can provide heuristic insights into the biological mechanisms that help to make highly accurate ALS diagnosis tools and achieve more interpretable results.
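
The integration and encoding steps described in the abstract can be illustrated with a minimal sketch. This is not the MOALS implementation: the abstract does not specify the architecture, so the per-sample concatenation of the two gene sets, the linear encoder, the latent size, the sample count, and the synthetic data below are all assumptions chosen for illustration. Only the gene counts (9,847 and 7,699) come from the abstract. The sketch covers the encoder side of a Variational Autoencoder (reparameterization and the KL term) and omits the decoder and training loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Gene counts taken from the abstract; everything else here is hypothetical.
N_EXPR_GENES = 9847      # genes selected from expression clustering
N_VARIANT_GENES = 7699   # genes carrying rare, presumed pathogenic variants
N_SAMPLES = 8            # assumed sample count, for illustration only
LATENT_DIM = 16          # assumed latent size

# Synthetic stand-ins for real data: expression values and 0/1 variant flags.
expr = rng.normal(size=(N_SAMPLES, N_EXPR_GENES))
variants = rng.integers(0, 2, size=(N_SAMPLES, N_VARIANT_GENES)).astype(float)


def integrate(expr_block, variant_block):
    """Concatenate the two omic blocks per sample (assumed integration scheme)."""
    return np.concatenate([expr_block, variant_block], axis=1)


def encode(x, w_mu, w_logvar):
    """Linear encoder producing the mean and log-variance of q(z|x)."""
    return x @ w_mu, x @ w_logvar


def reparameterize(mu, logvar, generator):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = generator.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def kl_divergence(mu, logvar):
    """KL(q(z|x) || N(0, I)) per sample, summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1)


x = integrate(expr, variants)
w_mu = rng.normal(scale=0.01, size=(x.shape[1], LATENT_DIM))
w_logvar = rng.normal(scale=0.01, size=(x.shape[1], LATENT_DIM))

mu, logvar = encode(x, w_mu, w_logvar)
z = reparameterize(mu, logvar, rng)
kl = kl_divergence(mu, logvar)

print(x.shape, z.shape)
```

The integrated matrix has 9,847 + 7,699 = 17,546 columns, matching the combined gene count reported in the abstract. In a full model, the latent vectors `z` would feed a decoder for reconstruction and a downstream classifier, with the KL term added to the reconstruction loss.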

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/623f/11619964/b81ce4697ed2/gr001.jpg
