GSMA: an approach to identify robust global and test Gene Signatures using Meta-Analysis.

Affiliations

Department of Computer Science, Wayne State University, Detroit, MI 48202, USA.

Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557, USA.

Publication information

Bioinformatics. 2020 Jan 15;36(2):487-495. doi: 10.1093/bioinformatics/btz561.

DOI:10.1093/bioinformatics/btz561

PMID:31329248

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7869776/

Abstract

MOTIVATION

Recent advances in biomedical research have made massive amounts of transcriptomic data from different sources available in public repositories. Because of the heterogeneity among individual experiments, identifying reproducible biomarkers for a given disease from multiple independent studies has become a major challenge. The widely used meta-analysis approaches, such as Fisher's method, Stouffer's method, minP and maxP, have at least two major limitations: (i) they are sensitive to outliers, and (ii) they perform only one statistical test for each individual study, and hence do not fully utilize the potential sample size to gain statistical power.
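As a rough illustration of the four classical methods named above, the following minimal sketch combines one p-value per study using their textbook definitions. It is not the GSMA framework itself. The function names and the example p-values are hypothetical, and the p-values are assumed to be independent and to lie strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def fisher(pvals):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2k df under the null."""
    k = len(pvals)
    half = -sum(math.log(p) for p in pvals)  # half of the chi-square statistic
    # Closed-form chi-square survival function for an even number (2k) of df.
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(k))


def stouffer(pvals):
    """Stouffer's method: sum of z-scores divided by sqrt(k) is standard normal."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1 - p) for p in pvals) / math.sqrt(len(pvals))
    return 1 - nd.cdf(z)


def min_p(pvals):
    """minP: the smallest of k independent uniform p-values is Beta(1, k)."""
    return 1 - (1 - min(pvals)) ** len(pvals)


def max_p(pvals):
    """maxP: the largest of k independent uniform p-values is Beta(k, 1)."""
    return max(pvals) ** len(pvals)


# Hypothetical per-study p-values for one gene: four unremarkable studies,
# then the same four plus a single extreme study (the outlier).
typical = [0.3, 0.4, 0.5, 0.6]
with_outlier = typical + [1e-10]

for name, combine in [("Fisher", fisher), ("Stouffer", stouffer),
                      ("minP", min_p), ("maxP", max_p)]:
    print(f"{name:8s} typical={combine(typical):.3g} "
          f"with_outlier={combine(with_outlier):.3g}")
```

With these made-up inputs, a single extreme study is enough to move Fisher's combined p-value from clearly non-significant to highly significant, which is the kind of outlier sensitivity described in limitation (i).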

RESULTS

Here, we propose a gene-level meta-analysis framework that overcomes these limitations and identifies a gene signature that is reliable and reproducible across multiple independent studies of a given disease. The approach provides a comprehensive global signature that can be used to understand the underlying biological phenomena, and a smaller test signature that can be used to classify future samples of a given disease. We demonstrate the utility of the framework by constructing disease signatures for influenza and Alzheimer's disease using nine datasets comprising 1108 individuals. These signatures are then validated on 12 independent datasets comprising 912 individuals. The results indicate that the proposed approach performs better than the majority of the existing meta-analysis approaches in terms of both sensitivity and specificity. The proposed signatures could be further used in diagnosis, prognosis and identification of therapeutic targets.
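For readers unfamiliar with the two evaluation metrics, the sketch below computes sensitivity and specificity from true and predicted disease labels using their standard definitions. The function name and the label vectors are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = disease, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diseased, called diseased
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diseased, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # control, called control
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # control, called diseased
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one control sample")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels for five validation samples.
truth = [1, 1, 1, 0, 0]
calls = [1, 1, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, calls)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Sensitivity is the fraction of diseased samples that a signature flags correctly, and specificity is the fraction of controls it correctly leaves unflagged.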

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
