Kim MinHyeok, You Bo-Hyun, Nam Jin-Wu
Department of Life Science, College of Natural Sciences, Hanyang University, Seoul 133791, Republic of Korea; Institute of Computer Technology, Seoul National University, Seoul 151742, Republic of Korea.
Department of Life Science, College of Natural Sciences, Hanyang University, Seoul 133791, Republic of Korea; Research Institute for Natural Sciences, Hanyang University, Seoul 133791, Republic of Korea.
Methods. 2015 Jul 15;83:111-7. doi: 10.1016/j.ymeth.2015.04.011. Epub 2015 Apr 18.
The 3' untranslated region (3' UTR) of mRNA contains elements that play regulatory roles in polyadenylation, localization, translation efficiency, and mRNA stability. Despite the significance of the 3' UTR, no widely adopted method exists for annotating 3' UTRs and profiling their isoforms. Recently, poly(A)-position profiling by sequencing (3P-seq) and similar methods have been used successfully to annotate 3' UTRs; however, they involve complex RNA-biochemical experimental steps that result in a low product yield. In this paper, we propose heuristic and regression methods to estimate and quantify 3' UTR usage from widely available RNA sequencing (RNA-seq) data. With this approach, the 3' UTR usage estimated from RNA-seq was highly correlated with that estimated from 3P-seq, and poly(A) cleavage signals were detected upstream of the predicted poly(A) cleavage sites. Our methods predicted a greater number of 3' UTRs than 3P-seq, which allows profiling of the 3' UTRs of most expressed genes in diverse cell types, stages, and species. Hence, the computational RNA-seq method for estimating the 3' UTR landscape would be a useful tool for studying not only the functional roles of the 3' UTR but also gene regulation by the 3' UTR in a cell type-specific context. The method is implemented in open-source code, which is available at http://big.hanyang.ac.kr/GETUTR.