Benchmarking of methods that identify alternative polyadenylation events in single-/multiple-polyadenylation site genes.
Author information
Tian Qiuxiang, Zou Quan, Jia Linpei
Affiliations
College of Information Science and Engineering, Hunan University, Changsha, Hunan, 410082, China.
School of Information Technology and Administration, Hunan University of Finance and Economics, Changsha, 410205, China.
Publication information
NAR Genom Bioinform. 2025 May 14;7(2):lqaf056. doi: 10.1093/nargab/lqaf056. eCollection 2025 Jun.
Alternative polyadenylation (APA) is a widespread post-transcriptional mechanism that diversifies gene expression by generating messenger RNA isoforms with varying 3' untranslated regions. Accurate identification and quantification of transcriptome-wide polyadenylation site (PAS) usage are essential for understanding APA-mediated gene regulation and its biological implications. In this review, we first survey the landscape of computational tools developed to identify APA events from RNA sequencing (RNA-seq) data. We then benchmarked five PAS prediction tools and seven APA detection algorithms using five RNA-seq datasets derived from clear cell renal cell carcinoma (ccRCC) and adjacent normal tissues. By evaluating tool performance separately on genes with a single PAS and genes with multiple PASs, we revealed substantial variation in accuracy, sensitivity, and consistency among the tools. Based on this comparative analysis, we offer practical guidelines for tool selection and propose considerations for improving APA detection accuracy. Additionally, our analysis identified CCNL2 as a candidate gene exhibiting significant APA regulation in ccRCC, highlighting its potential as a disease-associated biomarker.