Gene set analysis approaches for RNA-seq data: performance evaluation and application guideline.

Authors

Rahmatallah Yasir, Emmert-Streib Frank, Glazko Galina

Publication information

Brief Bioinform. 2016 May;17(3):393-407. doi: 10.1093/bib/bbv069. Epub 2015 Sep 4.

DOI:10.1093/bib/bbv069

PMID:26342128

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4870397/

Abstract

Transcriptome sequencing (RNA-seq) is gradually replacing microarrays for high-throughput studies of gene expression. The main challenge of analyzing microarray data is not in finding differentially expressed genes, but in gaining insights into the biological processes underlying phenotypic differences. To interpret experimental results from microarrays, gene set analysis (GSA) has become the method of choice, in particular because it incorporates pre-existing biological knowledge (in the form of functionally related gene sets) into the analysis. Here we provide a brief review of several statistically different GSA approaches (competitive and self-contained) that can be adapted from microarray practice, as well as those specifically designed for RNA-seq. We evaluate their performance (in terms of Type I error rate, power, robustness to sample size and heterogeneity, and sensitivity to different types of selection bias) on simulated and real RNA-seq data. Not surprisingly, the performance of the various GSA approaches depends only on the statistical hypothesis they test and does not depend on whether the test was developed for microarray or RNA-seq data. Interestingly, we found that competitive methods have lower power and lower robustness to sample heterogeneity than self-contained methods, leading to poor reproducibility of results. We also found that the power of unsupervised competitive methods depends on the balance between up- and down-regulated genes in the tested gene sets. These properties of competitive methods have been overlooked before. Our evaluation provides a concise guideline for selecting the GSA approaches that perform best under particular experimental settings in the context of RNA-seq.
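
The distinction between self-contained and competitive hypotheses can be made concrete with a toy example. The following is a minimal sketch, not any of the methods evaluated in the paper: it assumes Gaussian, already-normalized log-scale expression values (real RNA-seq counts would need normalization first), and the function names, the sum-of-squares statistic, and the random-gene-set null are illustrative choices. The self-contained test permutes sample labels and asks whether any gene in the set is differentially expressed; the competitive test resamples genes and asks whether the set is more differentially expressed than the rest of the genes.

```python
import random


def avg(xs):
    """Arithmetic mean of an iterable of numbers."""
    xs = list(xs)
    return sum(xs) / len(xs)


def gene_scores(expr, labels):
    """Per-gene difference in mean expression (group 1 minus group 0)."""
    scores = []
    for row in expr:
        g1 = [x for x, lab in zip(row, labels) if lab == 1]
        g0 = [x for x, lab in zip(row, labels) if lab == 0]
        scores.append(avg(g1) - avg(g0))
    return scores


def self_contained_p(expr, labels, gene_set, n_perm=500, rng=None):
    """Self-contained test: null is 'no gene in the set is differentially
    expressed'. The null distribution comes from permuting sample labels."""
    rng = rng or random.Random(0)
    sub = [expr[i] for i in gene_set]

    def stat(lab):
        return sum(s * s for s in gene_scores(sub, lab))

    observed = stat(labels)
    perm = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(perm)
        if stat(perm) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def competitive_p(expr, labels, gene_set, n_perm=500, rng=None):
    """Competitive test: null is 'the set is no more differentially expressed
    than random gene sets of the same size'. Genes are resampled, not samples."""
    rng = rng or random.Random(0)
    scores = [abs(s) for s in gene_scores(expr, labels)]
    observed = avg(scores[i] for i in gene_set)
    hits = 0
    for _ in range(n_perm):
        rand_set = rng.sample(range(len(expr)), len(gene_set))
        if avg(scores[i] for i in rand_set) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (assumed): 200 genes, 10 samples per group, first 20 genes shifted.
rng = random.Random(42)
n_genes, n_per_group = 200, 10
labels = [0] * n_per_group + [1] * n_per_group
de_set = list(range(20))
expr = []
for g in range(n_genes):
    shift = 2.0 if g in de_set else 0.0
    expr.append([rng.gauss(0.0, 1.0) + shift * lab for lab in labels])

p_self = self_contained_p(expr, labels, de_set, rng=random.Random(1))
p_comp = competitive_p(expr, labels, de_set, rng=random.Random(2))
print("self-contained p:", p_self)
print("competitive p:", p_comp)
```

The competitive sketch uses absolute gene scores; with signed scores, up- and down-regulated genes in the same set would partly cancel in the set average.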
