
Random forest-integrated analysis in AD and LATE brain transcriptome-wide data to identify disease-specific gene expression.

Affiliations

University of Kentucky, Lexington, Kentucky, United States of America.

Qingdao University, Qingdao, Shandong, China.

Publication information

PLoS One. 2021 Sep 7;16(9):e0256648. doi: 10.1371/journal.pone.0256648. eCollection 2021.

DOI:10.1371/journal.pone.0256648
PMID:34492068
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8423259/
Abstract

Alzheimer's disease (AD) is a complex neurodegenerative disorder that affects thinking, memory, and behavior. Limbic-predominant age-related TDP-43 encephalopathy (LATE) is a recently identified common neurodegenerative disease that mimics the clinical symptoms of AD. The development of drugs to prevent or treat these neurodegenerative diseases has been slow, partly because the genes associated with these diseases are incompletely understood. A notable hindrance from a data analysis perspective is that, usually, the clinical samples for patients and controls are highly imbalanced, thus rendering it challenging to apply most existing machine learning algorithms to directly analyze such datasets. Meeting this data analysis challenge is critical, as more specific disease-associated gene identification may enable new insights into underlying disease-driving mechanisms and help find biomarkers and, in turn, improve prospects for effective treatment strategies. In order to detect disease-associated genes based on imbalanced transcriptome-wide data, we proposed an integrated multiple random forests (IMRF) algorithm. IMRF is effective in differentiating putative genes associated with subjects having LATE and/or AD from controls based on transcriptome-wide data, thereby enabling effective discrimination between these samples. Various forms of validations, such as cross-domain verification of our method over other datasets, improved and competitive classification performance by using identified genes, effectiveness of testing data with a classifier that is completely independent from decision trees and random forests, and relationships with prior AD and LATE studies on the genes linked to neurodegeneration, all testify to the effectiveness of IMRF in identifying genes with altered expression in LATE and/or AD.
We conclude that IMRF, as an effective feature selection algorithm for imbalanced data, is promising to facilitate the development of new gene biomarkers as well as targets for effective strategies of disease prevention and treatment.
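
The abstract does not describe how IMRF builds or integrates its forests, so the following is not the authors' algorithm. It is a minimal sketch of one generic way to rank features on imbalanced data: repeatedly draw a class-balanced subsample, fit a depth-1 tree (a stump, standing in for a full random forest) on a random subset of genes, and count how often each gene is chosen. All names, the toy data, and every parameter value are assumptions made for illustration.

```python
import random
from collections import Counter


def make_toy_data(n_controls=60, n_cases=8, n_genes=6, signal_gene=2, seed=1):
    """Simulate imbalanced expression data; only `signal_gene` differs in cases."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n in ((0, n_controls), (1, n_cases)):
        for _ in range(n):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                row[signal_gene] += 5.0  # strong, artificial shift (assumption)
            X.append(row)
            y.append(label)
    return X, y


def best_stump(rows, labels, gene_ids):
    """Pick the gene and threshold that misclassify the fewest samples."""
    best = None  # (gene, threshold, errors)
    for g in gene_ids:
        values = sorted({row[g] for row in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            wrong = sum((row[g] > t) != (lab == 1) for row, lab in zip(rows, labels))
            errors = min(wrong, len(rows) - wrong)  # allow either split direction
            if best is None or errors < best[2]:
                best = (g, t, errors)
    return best


def rank_genes(X, y, n_rounds=200, n_candidates=3, seed=0):
    """Count how often each gene wins a stump fitted on a balanced subsample."""
    rng = random.Random(seed)
    counts = Counter(y)
    minority_label = min(counts, key=counts.get)
    minority = [i for i, lab in enumerate(y) if lab == minority_label]
    majority = [i for i, lab in enumerate(y) if lab != minority_label]
    votes = Counter()
    for _ in range(n_rounds):
        # Balance the classes: keep every minority sample and draw an
        # equally sized random set of majority samples.
        idx = minority + rng.sample(majority, len(minority))
        # Restrict each stump to a random subset of genes.
        genes = rng.sample(range(len(X[0])), n_candidates)
        stump = best_stump([X[i] for i in idx], [y[i] for i in idx], genes)
        if stump is not None:
            votes[stump[0]] += 1
    return votes.most_common()


X, y = make_toy_data()
ranking = rank_genes(X, y)
print(ranking[:3])  # (gene index, vote count) pairs, most selected first
```

Undersampling the majority class in every round keeps each individual learner from simply predicting "control", and aggregating over many rounds means no control sample is permanently discarded. A real analysis would use full trees, far more genes, and held-out validation.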


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1332/8423259/ced3e0a8e898/pone.0256648.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1332/8423259/f43eca7348d4/pone.0256648.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1332/8423259/1bbc4bc285bb/pone.0256648.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1332/8423259/66d91f74c23a/pone.0256648.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1332/8423259/812cc225c4db/pone.0256648.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1332/8423259/e81cfc28c0d5/pone.0256648.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1332/8423259/bc771edeab7d/pone.0256648.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1332/8423259/7f4bd6fe9a7e/pone.0256648.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1332/8423259/fcf729e951d6/pone.0256648.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1332/8423259/d4f2b5364a22/pone.0256648.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1332/8423259/7153409ee949/pone.0256648.g011.jpg
