Transcript length bias in RNA-seq data confounds systems biology.
Author information
Oshlack Alicia, Wakefield Matthew J
Affiliation
Walter and Eliza Hall Institute of Medical Research, Parkville, Vic, Australia.
Publication information
Biol Direct. 2009 Apr 16;4:14. doi: 10.1186/1745-6150-4-14.
BACKGROUND
Several recent studies have demonstrated the effectiveness of deep sequencing for transcriptome analysis (RNA-seq) in mammals. As RNA-seq becomes more affordable, whole genome transcriptional profiling is likely to become the platform of choice for species with good genomic sequences. A rigorous analysis methodology has not yet been developed, and we are still at the stage of exploring the features of the data.
RESULTS
We investigated the effect of transcript length bias in RNA-seq data using three different published data sets. For standard analyses using aggregated tag counts for each gene, the ability to call differentially expressed genes between samples is strongly associated with the length of the transcript.
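To see why aggregated tag counts could tie detection power to transcript length, the following is a minimal sketch, not the authors' analysis. It assumes that tag counts for a gene are Poisson distributed with a mean proportional to transcript length, and that two samples are compared with a simple normal-approximation z-test. The function name `approx_power` and the parameters `reads_per_bp`, `fold_change` and `alpha` are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(length_bp, reads_per_bp=0.01, fold_change=1.5, alpha=0.05):
    """Approximate power of a two-sided z-test on the tag counts of one gene.

    Assumption (not from the source): counts are Poisson with a mean
    proportional to transcript length, at equal depth in both samples.
    """
    norm = NormalDist()
    mu_a = reads_per_bp * length_bp   # expected count in sample A
    mu_b = fold_change * mu_a         # expected count in sample B
    # The difference of two independent Poisson counts has
    # mean mu_b - mu_a and variance mu_a + mu_b.
    expected_z = (mu_b - mu_a) / sqrt(mu_a + mu_b)
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Probability that the z statistic falls in either rejection tail.
    return norm.cdf(expected_z - z_crit) + norm.cdf(-expected_z - z_crit)


# Same expression level per base and same fold change, different lengths.
lengths = [500, 1000, 2000, 4000, 8000]
powers = [approx_power(length) for length in lengths]
for length, power in zip(lengths, powers):
    print(f"{length:>5} bp  approximate power: {power:.2f}")
```

Under these assumptions the expected test statistic grows with the square root of the expected count, so at a fixed fold change and per-base expression level a longer transcript is more likely to be called differentially expressed than a shorter one.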
CONCLUSION
Transcript length bias for calling differentially expressed genes is a general feature of current protocols for RNA-seq technology. This has implications for the ranking of differentially expressed genes, and in particular may introduce bias in gene set testing for pathway analysis and other multi-gene systems biology analyses.
REVIEWERS
This article was reviewed by Rohan Williams (nominated by Gavin Huttley), Nicole Cloonan (nominated by Mark Ragan) and James Bullard (nominated by Sandrine Dudoit).