Modeling non-uniformity in short-read rates in RNA-Seq data.

Affiliation

Department of Statistics, Stanford University, Sequoia Hall, 390 Serra Mall, Stanford, CA 94305, USA.

Publication information

Genome Biol. 2010;11(5):R50. doi: 10.1186/gb-2010-11-5-r50. Epub 2010 May 11.

DOI:10.1186/gb-2010-11-5-r50

PMID:20459815

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2898062/

Abstract

After mapping, RNA-Seq data can be summarized by a sequence of read counts commonly modeled as Poisson variables with constant rates along each transcript, which actually fit data poorly. We suggest using variable rates for different positions, and propose two models to predict these rates based on local sequences. These models explain more than 50% of the variations and can lead to improved estimates of gene and isoform expressions for both Illumina and Applied Biosystems data.
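
The idea of position-specific rates predicted from local sequence can be illustrated with a small sketch. The abstract does not spell out the form of the two proposed models, so everything below is an assumption made for illustration: a log-linear Poisson regression on one-hot encoded neighboring nucleotides, fitted by gradient ascent, on simulated counts with made-up nucleotide multipliers. The function names (`encode_window`, `fit_poisson`, and so on) are hypothetical and are not taken from the paper or any library.

```python
import math
import random


def encode_window(seq, pos, flank=1):
    """Intercept plus one-hot codes (reference base 'A') for pos-flank..pos+flank."""
    x = [1.0]
    for offset in range(-flank, flank + 1):
        base = seq[pos + offset]
        x.extend(1.0 if base == b else 0.0 for b in "CGT")
    return x


def log_lik(beta, X, y):
    """Poisson log-likelihood, dropping the constant -sum(log(y!))."""
    total = 0.0
    for x, c in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, x))  # log of the position's rate
        total += c * eta - math.exp(eta)
    return total


def fit_poisson(X, y, iters=200):
    """Gradient ascent with backtracking, started at the constant-rate fit."""
    n, p = len(y), len(X[0])
    beta = [math.log(sum(y) / n)] + [0.0] * (p - 1)
    ll = log_lik(beta, X, y)
    for _ in range(iters):
        grad = [0.0] * p
        for x, c in zip(X, y):
            resid = c - math.exp(sum(b * v for b, v in zip(beta, x)))
            for j, v in enumerate(x):
                grad[j] += resid * v / n
        step = 1.0
        while step > 1e-8:  # halve the step until the likelihood improves
            trial = [b + step * g for b, g in zip(beta, grad)]
            trial_ll = log_lik(trial, X, y)
            if trial_ll > ll:
                beta, ll = trial, trial_ll
                break
            step /= 2
    return beta


def sample_poisson(rng, mu):
    """Draw one Poisson(mu) count with Knuth's multiplication method."""
    limit, k, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def saturated_ll(y):
    """Log-likelihood of the model that fits every count exactly."""
    return sum(c * math.log(c) - c for c in y if c > 0)


# Simulated transcript: the multipliers below are invented for illustration.
rng = random.Random(0)
seq = "".join(rng.choice("ACGT") for _ in range(302))
effect = {"A": 1.0, "C": 1.5, "G": 3.0, "T": 0.5}
positions = range(1, 301)
X = [encode_window(seq, i) for i in positions]
y = [sample_poisson(rng, 2.0 * effect[seq[i]]) for i in positions]

beta = fit_poisson(X, y)
const = [math.log(sum(y) / len(y))] + [0.0] * (len(beta) - 1)  # constant-rate model
sat = saturated_ll(y)
r2 = 1 - (sat - log_lik(beta, X, y)) / (sat - log_lik(const, X, y))
print(f"fraction of deviance explained: {r2:.2f}")
```

The comparison against `const` mirrors the abstract's contrast between a constant rate along the transcript and rates that vary by position; the fraction of deviance explained is one plausible way to express "variation explained", though the paper's own measure may differ.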

Similar articles

Improving RNA-Seq expression estimation by modeling isoform- and exon-specific read sequencing rate.

BMC Bioinformatics. 2015 Oct 16;16:332. doi: 10.1186/s12859-015-0750-6.

Sparse linear modeling of next-generation mRNA sequencing (RNA-Seq) data for isoform discovery and abundance estimation.

Proc Natl Acad Sci U S A. 2011 Dec 13;108(50):19867-72. doi: 10.1073/pnas.1113972108. Epub 2011 Dec 1.

Towards reliable isoform quantification using RNA-SEQ data.

BMC Bioinformatics. 2010 Apr 29;11 Suppl 3(Suppl 3):S6. doi: 10.1186/1471-2105-11-S3-S6.

WemIQ: an accurate and robust isoform quantification method for RNA-seq data.

Bioinformatics. 2015 Mar 15;31(6):878-85. doi: 10.1093/bioinformatics/btu757. Epub 2014 Nov 17.

Using non-uniform read distribution models to improve isoform expression inference in RNA-Seq.

Bioinformatics. 2011 Feb 15;27(4):502-8. doi: 10.1093/bioinformatics/btq696. Epub 2010 Dec 17.

Isoform abundance inference provides a more accurate estimation of gene expression levels in RNA-seq.

J Bioinform Comput Biol. 2010 Dec;8 Suppl 1:177-92. doi: 10.1142/s0219720010005178.

PennSeq: accurate isoform-specific gene expression quantification in RNA-Seq by modeling non-uniform read distribution.

Nucleic Acids Res. 2014 Feb;42(3):e20. doi: 10.1093/nar/gkt1304. Epub 2013 Dec 20.

EMSAR: estimation of transcript abundance from RNA-seq data by mappability-based segmentation and reclustering.

BMC Bioinformatics. 2015 Sep 3;16:278. doi: 10.1186/s12859-015-0704-z.

A two-parameter generalized Poisson model to improve the analysis of RNA-seq data.

Nucleic Acids Res. 2010 Sep;38(17):e170. doi: 10.1093/nar/gkq670. Epub 2010 Jul 29.

Cited by

A duplex sequencing approach for high-sensitivity detection of genome-edited plants.

Food Chem (Oxf). 2025 Jul 17;11:100278. doi: 10.1016/j.fochms.2025.100278. eCollection 2025 Dec.

Sources of non-uniform coverage in short-read RNA-Seq data.

bioRxiv. 2025 Feb 6:2025.01.30.634337. doi: 10.1101/2025.01.30.634337.

Enhancing RNA-seq analysis by addressing all co-existing biases using a self-benchmarking approach with 2D structural insights.

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae532.

Enhancing RNA-seq bias mitigation with the Gaussian self-benchmarking framework: towards unbiased sequencing data.

BMC Genomics. 2024 Sep 30;25(1):904. doi: 10.1186/s12864-024-10814-0.

Artifacts and biases of the reverse transcription reaction in RNA sequencing.

RNA. 2023 Jul;29(7):889-897. doi: 10.1261/rna.079623.123. Epub 2023 Mar 29.

Nanopore microscope identifies RNA isoforms with structural colours.

Nat Chem. 2022 Nov;14(11):1258-1264. doi: 10.1038/s41557-022-01037-5. Epub 2022 Sep 19.

Polee: RNA-Seq analysis using approximate likelihood.

NAR Genom Bioinform. 2021 May 25;3(2):lqab046. doi: 10.1093/nargab/lqab046. eCollection 2021 Jun.

Multiple freeze-thaw cycles lead to a loss of consistency in poly(A)-enriched RNA sequencing.

BMC Genomics. 2021 Jan 21;22(1):69. doi: 10.1186/s12864-021-07381-z.

AIDE: annotation-assisted isoform discovery with high precision.

Genome Res. 2019 Dec;29(12):2056-2072. doi: 10.1101/gr.251108.119. Epub 2019 Nov 6.

Modelling RNA-Seq data with a zero-inflated mixture Poisson linear model.

Genet Epidemiol. 2019 Oct;43(7):786-799. doi: 10.1002/gepi.22246. Epub 2019 Jul 22.
