Modeling and analysis of RNA-seq data: a review from a statistical perspective.

Authors

Li Wei Vivian, Li Jingyi Jessica

Affiliations

Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095-1554, USA.

Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA 90095-088, USA.

Publication information

Quant Biol. 2018 Sep;6(3):195-209. doi: 10.1007/s40484-018-0144-7. Epub 2018 Aug 10.

Abstract

BACKGROUND

Since their invention, next-generation RNA sequencing (RNA-seq) technologies have become a powerful tool for studying the presence and quantity of RNA molecules in biological samples and have revolutionized transcriptomic studies. The analysis of RNA-seq data at four different levels (samples, genes, transcripts, and exons) involves multiple statistical and computational questions, some of which remain challenging to date.

RESULTS

We review RNA-seq analysis tools at the sample, gene, transcript, and exon levels from a statistical perspective. We also highlight the biological and statistical questions of greatest practical relevance.

CONCLUSIONS

Statistical and computational methods for analyzing RNA-seq data have advanced significantly in the past decade. However, methods developed to answer the same biological question often rely on diverse statistical models and perform differently under different scenarios. This review discusses and compares multiple commonly used statistical models with respect to their assumptions, in the hope of helping users select appropriate methods as needed and assisting developers in future method development.
