IVT-seq reveals extreme bias in RNA sequencing.

Author information

Lahens Nicholas F, Kavakli Ibrahim Halil, Zhang Ray, Hayer Katharina, Black Michael B, Dueck Hannah, Pizarro Angel, Kim Junhyong, Irizarry Rafael, Thomas Russell S, Grant Gregory R, Hogenesch John B

Publication information

Genome Biol. 2014 Jun 30;15(6):R86. doi: 10.1186/gb-2014-15-6-r86.

Abstract

BACKGROUND

RNA-seq is a powerful technique for identifying and quantifying transcription and splicing events, both known and novel. However, given its recent development and the proliferation of library construction methods, understanding the bias it introduces is incomplete but critical to realizing its value.

RESULTS

We present a method, in vitro transcription sequencing (IVT-seq), for identifying and assessing the technical biases in RNA-seq library generation and sequencing at scale. We created a pool of over 1,000 in vitro transcribed RNAs from a full-length human cDNA library and sequenced them with polyA and total RNA-seq, the most common protocols. Because each cDNA is full length, and we show in vitro transcription is incredibly processive, each base in each transcript should be equivalently represented. However, with common RNA-seq applications and platforms, we find 50% of transcripts have more than two-fold and 10% have more than 10-fold differences in within-transcript sequence coverage. We also find greater than 6% of transcripts have regions of dramatically unpredictable sequencing coverage between samples, confounding accurate determination of their expression. We use a combination of experimental and computational approaches to show rRNA depletion is responsible for the most significant variability in coverage, and several sequence determinants also strongly influence representation.
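The within-transcript fold-difference figures above can be illustrated with a small calculation. The following is a minimal sketch, not the authors' pipeline: the abstract does not state how the fold difference was computed, so this example assumes it is the ratio of the highest to the lowest per-base coverage within a transcript, with a pseudocount added to avoid division by zero. The function names, the pseudocount, and the toy coverage values are all hypothetical.

```python
from typing import Dict, List


def within_transcript_fold(coverage: List[float], pseudocount: float = 1.0) -> float:
    """Ratio of the highest to the lowest per-base coverage in one transcript.

    The max/min definition and the pseudocount are assumptions of this
    sketch; the abstract does not specify the exact metric.
    """
    if not coverage:
        raise ValueError("coverage must contain at least one position")
    return (max(coverage) + pseudocount) / (min(coverage) + pseudocount)


def fraction_exceeding(per_transcript: Dict[str, List[float]], threshold: float) -> float:
    """Fraction of transcripts whose within-transcript fold difference exceeds threshold."""
    if not per_transcript:
        raise ValueError("need at least one transcript")
    folds = [within_transcript_fold(cov) for cov in per_transcript.values()]
    return sum(fold > threshold for fold in folds) / len(folds)


# Hypothetical per-base read depths for four transcripts (toy data, not from the paper).
toy_coverage = {
    "tx_flat": [100, 100, 100, 100],  # ideal case: every base equally represented
    "tx_mild": [100, 150, 120, 90],
    "tx_3x": [50, 150, 160, 60],
    "tx_20x": [9, 199, 150, 100],
}

if __name__ == "__main__":
    print("fraction > 2-fold:", fraction_exceeding(toy_coverage, 2.0))
    print("fraction > 10-fold:", fraction_exceeding(toy_coverage, 10.0))
```

If in vitro transcription and library preparation were unbiased, every transcript would look like `tx_flat` and both fractions would be zero. A more robust variant might compare high and low coverage percentiles rather than the extremes, which are sensitive to single-base outliers.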

CONCLUSIONS

These results show the utility of IVT-seq for promoting better understanding of bias introduced by RNA-seq. We find rRNA depletion is responsible for substantial, unappreciated biases in coverage introduced during library preparation. These biases suggest exon-level expression analysis may be inadvisable, and we recommend caution when interpreting RNA-seq results.
