Characterization and improvement of RNA-Seq precision in quantitative transcript expression profiling.

Affiliation

Boku University Vienna, 1190 Muthgasse 18, Vienna, Austria.

Publication information

Bioinformatics. 2011 Jul 1;27(13):i383-91. doi: 10.1093/bioinformatics/btr247.

DOI:10.1093/bioinformatics/btr247

PMID:21685096

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3117338/

Abstract

MOTIVATION

Measurement precision determines the power of any analysis to reliably identify significant signals, such as in screens for differential expression, independent of whether the experimental design incorporates replicates or not. With the compilation of large-scale RNA-Seq datasets with technical replicate samples, however, we can now, for the first time, perform a systematic analysis of the precision of expression level estimates from massively parallel sequencing technology. This then allows considerations for its improvement by computational or experimental means.

RESULTS

We report on a comprehensive study of target identification and measurement precision, including their dependence on transcript expression levels, read depth and other parameters. In particular, an impressive recall of 84% of the estimated true transcript population could be achieved with 331 million 50 bp reads, with diminishing returns from longer read lengths and even smaller gains from increased sequencing depths. Most of the measurement power (75%) is spent on only 7% of the known transcriptome, however, making less strongly expressed transcripts harder to measure. Consequently, fewer than 30% of all transcripts could be quantified reliably with a relative error below 20%. Based on established tools, we then introduce a new approach for mapping and analysing sequencing reads that yields substantially improved performance in gene expression profiling, increasing the share of transcripts that can reliably be quantified to over 40%. Extrapolations to higher sequencing depths highlight the need for efficient complementary steps. In the discussion, we outline possible experimental and computational strategies for further improvements in quantification precision.
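The two summary statistics quoted above (the share of transcripts quantified within a given relative error, and the share of reads consumed by the most highly expressed transcripts) can be illustrated with a minimal sketch. The abstract does not define "relative error", so this example assumes it is the coefficient of variation across technical replicates; the function names and the toy counts are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def relative_error(replicate_counts):
    """Coefficient of variation (sample std dev / mean) across replicates.

    Assumption: this stands in for the paper's 'relative error', which the
    abstract does not define. Returns None when the mean count is zero.
    """
    m = mean(replicate_counts)
    if m == 0:
        return None
    return stdev(replicate_counts) / m


def fraction_reliable(counts_by_transcript, max_rel_error=0.20):
    """Fraction of transcripts whose relative error is below the threshold.

    Transcripts with a zero mean count are treated as not reliably quantified.
    """
    reliable = 0
    for counts in counts_by_transcript.values():
        err = relative_error(counts)
        if err is not None and err < max_rel_error:
            reliable += 1
    return reliable / len(counts_by_transcript)


def read_share_of_top(counts_by_transcript, top_fraction=0.07):
    """Share of all reads captured by the most highly expressed transcripts.

    Assumes at least one transcript has a non-zero count.
    """
    totals = sorted((sum(c) for c in counts_by_transcript.values()), reverse=True)
    n_top = max(1, round(top_fraction * len(totals)))
    return sum(totals[:n_top]) / sum(totals)


# Hypothetical read counts for four transcripts in three technical replicates.
toy = {
    "tx_high": [1000, 1040, 980],
    "tx_mid": [100, 110, 95],
    "tx_low": [4, 9, 1],
    "tx_zero": [0, 0, 0],
}

print(fraction_reliable(toy))        # share of transcripts under 20% relative error
print(read_share_of_top(toy, 0.25))  # read share taken by the top quarter of transcripts
```

In the toy data the weakly expressed transcript varies widely between replicates while the strongly expressed ones do not, which mirrors the abstract's point that measurement power concentrated on a few abundant transcripts leaves the rest harder to quantify.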

CONTACT

rnaseq10@boku.ac.at
