Mendelevich Asia, Gupta Saumya, Pakharev Aleksei, Teodosiadis Athanasios, Mironov Andrey A, Gimelbrant Alexander A
Altius Institute for Biomedical Sciences, Seattle, WA, USA.
Stem Cell Program, Boston Children's Hospital, Boston, MA, USA.
bioRxiv. 2023 Feb 12:2023.02.11.528027. doi: 10.1101/2023.02.11.528027.
Analysis of allele-specific expression is strongly affected by the technical noise present in RNA-seq experiments. Previously, we showed that technical replicates can be used for precise estimates of this noise, and we provided a tool for correction of technical noise in allele-specific expression analysis. This approach is very accurate but costly due to the need for two or more replicates of each library. Here, we develop a spike-in approach that is highly accurate at only a small fraction of the cost.
We show that a distinct RNA added as a spike-in before library preparation reflects the technical noise of the whole library and can be used in large batches of samples. We experimentally demonstrate the effectiveness of this approach using combinations of RNA from species distinguishable by alignment, namely, mouse, human, and […]. Our new approach, controlFreq, enables highly accurate and computationally efficient analysis of allele-specific expression in (and between) arbitrarily large studies at an overall cost increase of ~5%.
The analysis pipeline for this approach is available on GitHub as the R package controlFreq (github.com/gimelbrantlab/controlFreq).
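To make the idea of using a spike-in as a noise reference concrete, the following is a minimal sketch, not the controlFreq implementation or its API. It assumes that the same spike-in RNA has allelic counts in two libraries, and it uses a generic quasi-binomial dispersion estimate: the mean squared standardized difference in allelic ratio between the two libraries, which is close to 1 when only binomial sampling noise is present. The function names `overdispersion_from_spikein` and `allelic_ratio_interval`, and the input layout, are hypothetical.

```python
import math


def overdispersion_from_spikein(lib_a, lib_b):
    """Estimate a dispersion factor from spike-in allelic counts.

    lib_a, lib_b: lists of (ref, alt) read counts for the same spike-in
    genes in two libraries (hypothetical input layout).
    Returns a factor >= 1; 1 means no noise beyond binomial sampling.
    """
    z_squared = []
    for (r1, a1), (r2, a2) in zip(lib_a, lib_b):
        n1, n2 = r1 + a1, r2 + a2
        if n1 == 0 or n2 == 0:
            continue  # gene not covered in one of the libraries
        pooled = (r1 + r2) / (n1 + n2)
        # Binomial variance of the difference between the two allelic ratios
        var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
        if var == 0:
            continue  # fully monoallelic counts carry no information here
        z_squared.append((r1 / n1 - r2 / n2) ** 2 / var)
    if not z_squared:
        raise ValueError("no informative spike-in genes")
    # Floor at 1 so the correction never claims less noise than binomial
    return max(1.0, sum(z_squared) / len(z_squared))


def allelic_ratio_interval(ref, alt, dispersion, z=1.96):
    """Normal-approximation interval for ref/(ref+alt), widened by the
    square root of the dispersion factor (an assumption of this sketch)."""
    n = ref + alt
    p = ref / n
    half_width = z * math.sqrt(dispersion * p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)
```

In this sketch, a dispersion factor estimated once from the spike-in would be reused to widen the allelic-ratio intervals of every gene in the sample library, which mirrors the abstract's claim that the spike-in reflects the technical noise of the whole library.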