GSMA: an approach to identify robust global and test Gene Signatures using Meta-Analysis.

Affiliations

Department of Computer Science, Wayne State University, Detroit, MI 48202, USA.

Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557, USA.

Publication information

Bioinformatics. 2020 Jan 15;36(2):487-495. doi: 10.1093/bioinformatics/btz561.

Abstract

MOTIVATION

Recent advances in biomedical research have made massive amounts of transcriptomic data from different sources available in public repositories. Because of the heterogeneity among individual experiments, identifying reproducible biomarkers for a given disease from multiple independent studies has become a major challenge. The widely used meta-analysis approaches, such as Fisher's method, Stouffer's method, minP and maxP, have at least two major limitations: (i) they are sensitive to outliers, and (ii) they perform only one statistical test for each individual study, and hence do not fully utilize the potential sample size to gain statistical power.
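For readers unfamiliar with the four classical methods named above, the following is a minimal sketch of how each one combines per-study p-values for a single gene. It is not the GSMA framework itself. It assumes independent studies, p-values strictly between 0 and 1, and the unweighted one-sided form of Stouffer's method; the function names and the example p-values are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def fisher_p(pvalues):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2k degrees of freedom."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # half of the chi-square statistic
    # Closed-form chi-square survival function, valid for an even number (2k) of d.o.f.
    tail = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return min(1.0, math.exp(-half_stat) * tail)


def stouffer_p(pvalues):
    """Stouffer's method (unweighted, one-sided): sum of z-scores divided by sqrt(k)."""
    z_scores = [-_STD_NORMAL.inv_cdf(p) for p in pvalues]
    combined_z = sum(z_scores) / math.sqrt(len(pvalues))
    return 1.0 - _STD_NORMAL.cdf(combined_z)


def min_p(pvalues):
    """minP: the smallest p-value, adjusted for taking the minimum of k values."""
    return 1.0 - (1.0 - min(pvalues)) ** len(pvalues)


def max_p(pvalues):
    """maxP: the largest p-value raised to the power k."""
    return max(pvalues) ** len(pvalues)


# Hypothetical p-values for one gene in four studies; only the first is small.
one_outlier = [1e-10, 0.6, 0.7, 0.8]
for combine in (fisher_p, stouffer_p, min_p, max_p):
    print(f"{combine.__name__}: {combine(one_outlier):.3g}")
```

With this input, Fisher's method and minP return very small combined p-values driven by the single extreme study, while maxP does not. This illustrates the outlier sensitivity described in limitation (i).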

RESULTS

Here, we propose a gene-level meta-analysis framework that overcomes these limitations and identifies a gene signature that is reliable and reproducible across multiple independent studies of a given disease. The approach provides a comprehensive global signature that can be used to understand the underlying biological phenomena, and a smaller test signature that can be used to classify future samples of a given disease. We demonstrate the utility of the framework by constructing disease signatures for influenza and Alzheimer's disease using nine datasets comprising 1108 individuals. These signatures are then validated on 12 independent datasets comprising 912 individuals. The results indicate that the proposed approach performs better than the majority of the existing meta-analysis approaches in terms of both sensitivity and specificity. The proposed signatures could be further used in diagnosis, prognosis and identification of therapeutic targets.
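The two evaluation metrics mentioned above can be computed from predicted and true class labels as in the short sketch below. It assumes binary labels (1 for disease, 0 for control) with at least one sample of each class; the function name and the example labels are hypothetical and do not come from the paper's data.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary labels: 1 = disease, 0 = control."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diseased, called diseased
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diseased, called control
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # control, called control
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # control, called diseased
    sensitivity = tp / (tp + fn)  # fraction of diseased samples detected
    specificity = tn / (tn + fp)  # fraction of controls correctly rejected
    return sensitivity, specificity


# Hypothetical validation set: four diseased samples followed by four controls.
true_labels = [1, 1, 1, 1, 0, 0, 0, 0]
predicted_labels = [1, 1, 1, 0, 0, 0, 0, 0]
print(sensitivity_specificity(true_labels, predicted_labels))  # (0.75, 1.0)
```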

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
