Modelling and simulating generic RNA-Seq experiments with the flux simulator.

Affiliations

Bioinformatics and Genomics Program, Centre de Regulació Genòmica (CRG), 08003 Barcelona, Spain.

Publication information

Nucleic Acids Res. 2012 Nov 1;40(20):10073-83. doi: 10.1093/nar/gks666. Epub 2012 Sep 7.

DOI:10.1093/nar/gks666

PMID:22962361

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3488205/

Abstract

High-throughput sequencing of cDNA libraries constructed from cellular RNA complements (RNA-Seq) naturally provides a digital quantitative measurement for every expressed RNA molecule. The nature, impact and mutual interference of biases in different experimental setups are, however, still poorly understood, mostly due to the lack of data from intermediate protocol steps. We analysed multiple RNA-Seq experiments, involving different sample preparation protocols and sequencing platforms: we broke them down into their common, and currently indispensable, technical components (reverse transcription, fragmentation, adapter ligation, PCR amplification, gel segregation and sequencing), investigating how these different steps influence the abundance and distribution of the sequenced reads. For each of those steps, we developed universally applicable models, which can be parameterised by empirical attributes of any experimental protocol. Our models are implemented in a computer simulation pipeline called the Flux Simulator, and we show that read distributions generated by different combinations of these models closely reproduce the evidence obtained from the corresponding experimental setups. We further demonstrate that our in silico RNA-Seq provides insights into hidden precursors that determine the final configuration of reads along gene bodies; enhancing or compensatory effects that explain apparently controversial observations can be observed. Moreover, our simulations identify hitherto unreported sources of systematic bias from RNA hydrolysis, a fragmentation technique currently employed by most RNA-Seq protocols.
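
The idea of chaining per-step models into one simulated experiment can be illustrated with a toy sketch. This is not the Flux Simulator's implementation or its models: every function name, parameter and default below is hypothetical, the fragmentation step simply cuts at uniformly random positions, and reverse transcription and adapter ligation are omitted. It only shows how fragmentation, size selection, PCR amplification and sequencing might be composed so that each step's output feeds the next.

```python
import random


def fragment(length, n_breaks, rng):
    """Cut a molecule spanning [0, length) at uniformly random positions.

    Assumption: uniform breakage, chosen only for simplicity.
    """
    n_breaks = min(n_breaks, length - 1)
    cuts = sorted(rng.sample(range(1, length), n_breaks))
    bounds = [0] + cuts + [length]
    return list(zip(bounds[:-1], bounds[1:]))


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length lies in [min_len, max_len] (toy gel cut)."""
    return [(s, e) for s, e in fragments if min_len <= e - s <= max_len]


def pcr_amplify(fragments, cycles, efficiency, rng):
    """Each cycle duplicates every molecule with probability `efficiency`."""
    pool = list(fragments)
    for _ in range(cycles):
        pool += [f for f in pool if rng.random() < efficiency]
    return pool


def sequence(library, n_reads, read_len, rng):
    """Draw fragments at random and read `read_len` bases from their 5' end."""
    if not library:
        return []
    reads = []
    for _ in range(n_reads):
        start, end = rng.choice(library)
        reads.append((start, min(start + read_len, end)))
    return reads


def simulate(transcript_len=2000, n_molecules=50, n_breaks=8,
             size_range=(150, 350), pcr_cycles=3, pcr_efficiency=0.8,
             n_reads=1000, read_len=50, seed=0):
    """Run the toy pipeline for copies of a single hypothetical transcript."""
    rng = random.Random(seed)
    fragments = []
    for _ in range(n_molecules):
        fragments.extend(fragment(transcript_len, n_breaks, rng))
    selected = size_select(fragments, *size_range)
    library = pcr_amplify(selected, pcr_cycles, pcr_efficiency, rng)
    return sequence(library, n_reads, read_len, rng)


if __name__ == "__main__":
    reads = simulate()
    print(len(reads), "reads; first five:", reads[:5])
```

Because each step is a separate function with its own parameters, changing one of them (for example the size range, or the number of PCR cycles) and comparing the resulting read start positions gives a rough feel for how a single protocol step can shape the final read distribution.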

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1da3/3488205/c845303102fc/gks666f1p.jpg

Similar articles

Modelling and simulating generic RNA-Seq experiments with the flux simulator.

Nucleic Acids Res. 2012 Nov 1;40(20):10073-83. doi: 10.1093/nar/gks666. Epub 2012 Sep 7.

High-Throughput Cellular RNA Sequencing (HiCAR-Seq): Cost-Effective, High-Throughput 3' mRNA-Seq Method Enabling Individual Sample Quality Control.

Curr Protoc Mol Biol. 2020 Sep;132(1):e123. doi: 10.1002/cpmb.123.

RNA sequencing and quantitation using the Helicos Genetic Analysis System.

Methods Mol Biol. 2011;733:37-49. doi: 10.1007/978-1-61779-089-8_3.

Simulating multiple faceted variability in single cell RNA sequencing.

Nat Commun. 2019 Jun 13;10(1):2611. doi: 10.1038/s41467-019-10500-w.

Improving RNA-Seq expression estimation by modeling isoform- and exon-specific read sequencing rate.

BMC Bioinformatics. 2015 Oct 16;16:332. doi: 10.1186/s12859-015-0750-6.

Quality control of RNA-seq experiments.

Methods Mol Biol. 2015;1269:137-46. doi: 10.1007/978-1-4939-2291-8_8.

Evaluation of the coverage and depth of transcriptome by RNA-Seq in chickens.

BMC Bioinformatics. 2011 Oct 18;12 Suppl 10(Suppl 10):S5. doi: 10.1186/1471-2105-12-S10-S5.

Local sequence and sequencing depth dependent accuracy of RNA-seq reads.

BMC Bioinformatics. 2017 Aug 9;18(1):364. doi: 10.1186/s12859-017-1780-z.

A computational method for estimating the PCR duplication rate in DNA and RNA-seq experiments.

BMC Bioinformatics. 2017 Mar 14;18(Suppl 3):43. doi: 10.1186/s12859-017-1471-9.

CSEQ-SIMULATOR: A DATA SIMULATOR FOR CLIP-SEQ EXPERIMENTS.

Pac Symp Biocomput. 2016;21:433-44.

Cited by

Selecting differential splicing methods: Practical considerations for short-read RNA sequencing.

F1000Res. 2025 May 30;14:47. doi: 10.12688/f1000research.155223.2. eCollection 2025.

Studying relative RNA localization from nucleus to the cytosol.

NAR Genom Bioinform. 2025 Jun 20;7(2):lqaf032. doi: 10.1093/nargab/lqaf032. eCollection 2025 Jun.

VirDiG: a transcriptome assembler for coronavirus.

Bioinform Adv. 2025 Apr 8;5(1):vbaf075. doi: 10.1093/bioadv/vbaf075. eCollection 2025.

Cov-trans: an efficient algorithm for discontinuous transcript assembly in coronaviruses.

BMC Genomics. 2024 Dec 30;25(1):1257. doi: 10.1186/s12864-024-11179-0.

BEERS2: RNA-Seq simulation through high fidelity in silico modeling.

Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae164.

Studying relative RNA localization From nucleus to the cytosol.

bioRxiv. 2024 Mar 11:2024.03.06.583744. doi: 10.1101/2024.03.06.583744.

Challenges and best practices in omics benchmarking.

Nat Rev Genet. 2024 May;25(5):326-339. doi: 10.1038/s41576-023-00679-6. Epub 2024 Jan 12.

On Bridging Paired-end RNA-seq Data.

ACM BCB. 2023 Sep;2023. doi: 10.1145/3584371.3612987. Epub 2023 Oct 4.

TRS: a method for determining transcript termini from RNAtag-seq sequencing data.

Nat Commun. 2023 Nov 29;14(1):7843. doi: 10.1038/s41467-023-43534-2.

A safety framework for flow decomposition problems via integer linear programming.

Bioinformatics. 2023 Nov 1;39(11). doi: 10.1093/bioinformatics/btad640.
