AITeQ: a machine learning framework for Alzheimer's prediction using a distinctive five-gene signature.

Author information

Bioinformatics Division, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka 1349, Bangladesh.

Department of Biochemistry and Microbiology, North South University, Bashundhara, Dhaka 1229, Bangladesh.

Publication information

Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae291.

Abstract

Neurodegenerative diseases such as Alzheimer's disease pose a significant global health challenge because of their complex etiology and elusive biomarkers. In this study, we developed the Alzheimer's Identification Tool (AITeQ), a machine learning (ML) model based on an optimized ensemble algorithm that identifies Alzheimer's disease from ribonucleic acid-sequencing (RNA-seq) data. Analysis of RNA-seq data from several studies identified 87 differentially expressed genes. This was followed by an ML protocol involving feature selection, model training, performance evaluation, and hyperparameter tuning. Feature selection, which combined four different methods, yielded a compact yet impactful set of five genes (CNKSR1, EPHA2, CLSPN, OLFML3, and TARBP1). Twelve diverse ML models were trained and tested using these five genes. Performance metrics, including precision, recall, F1 score, accuracy, Matthews correlation coefficient, and area under the receiver operating characteristic curve, were assessed for the finally selected model. Overall, the ensemble model consisting of logistic regression, a naive Bayes classifier, and a support vector machine with optimized hyperparameters was identified as the best and was used to develop AITeQ. AITeQ is available at: https://github.com/ishtiaque-ahammad/AITeQ.
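
To make the evaluation step concrete, the following is a minimal sketch of how three classifiers' predictions might be combined and scored. It is not the AITeQ implementation: the abstract does not state how the ensemble combines its members, so hard (majority) voting is an assumption here, and the function names `majority_vote` and `binary_metrics`, as well as the toy labels, are hypothetical. Area under the ROC curve is omitted because it requires continuous scores rather than hard labels.

```python
import math

def majority_vote(*model_predictions):
    """Hard-vote across per-model label lists (1 = Alzheimer's, 0 = control).

    Assumption: majority voting; the source does not specify the voting scheme.
    """
    return [1 if 2 * sum(votes) > len(votes) else 0
            for votes in zip(*model_predictions)]

def binary_metrics(y_true, y_pred):
    """Compute the label-based metrics named in the abstract."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "mcc": mcc}

# Toy data (hypothetical, not from the study): true labels and the
# predictions of three stand-in classifiers.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
pred_lr = [1, 1, 1, 0, 0, 0, 1, 0]   # stand-in for logistic regression
pred_nb = [1, 1, 0, 0, 0, 0, 0, 1]   # stand-in for naive Bayes
pred_svm = [1, 0, 1, 1, 0, 1, 0, 0]  # stand-in for support vector machine

ensemble = majority_vote(pred_lr, pred_nb, pred_svm)
metrics = binary_metrics(y_true, ensemble)
print(ensemble)
print(metrics)
```

In this toy case each individual model makes two errors, while the majority vote makes only one, which illustrates why an ensemble can outperform its members when their errors do not coincide.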
