Evaluation of methods to detect circular RNAs from single-end RNA-sequencing data.

Affiliations

Information Technology Institute, Vietnam National University in Hanoi, Hanoi, Vietnam.

University of Engineering and Technology, Vietnam National University in Hanoi, Hanoi, Vietnam.

Publication information

BMC Genomics. 2022 Feb 8;23(1):106. doi: 10.1186/s12864-022-08329-7.

Abstract

BACKGROUND

Circular RNA (circRNA), a class of RNA molecules with a loop structure, has recently attracted researchers' attention because of its diverse biological functions and its potential as a biomarker of human diseases. Most current methods for detecting circRNAs from RNA-sequencing (RNA-Seq) data use the mapping information of paired-end (PE) reads to eliminate false positives. However, much practical RNA-Seq data, such as cross-linking immunoprecipitation sequencing (CLIP-Seq) data, usually contains single-end (SE) reads. It is not clear how well these tools perform on SE RNA-Seq data.

RESULTS

In this study, we present a systematic evaluation of six advanced RNA-Seq based methods and two CLIP-Seq based methods for detecting circRNAs from SE RNA-Seq data. The performance of the methods is rigorously assessed based on precision, sensitivity, F1 score, and true discovery rate. Using simulated SE RNA-Seq datasets, we investigate how read length, false positive ratio, sequencing depth, and PE mapping information affect the performance of the methods. The real datasets used in this study consist of four experimental RNA-Seq datasets with read lengths of ≥100 bp and 124 CLIP-Seq samples from 45 studies that contain mostly short-read (≤50 bp) RNA-Seq data. The simulation study shows that the sensitivity of most methods can be improved by increasing either read length or sequencing depth, and that the false positive ratio significantly affects the precision of all methods. Furthermore, PE mapping information can improve a method's precision but does not always guarantee an increase in F1 score. Overall, no method is dominant for all SE RNA-Seq data. The RNA-Seq based methods perform better on the long-read datasets but worse on the short-read datasets. In contrast, the CLIP-Seq based methods outperform the RNA-Seq based methods on all the short-read samples. Combining the results of these methods can significantly improve precision on the CLIP-Seq data.
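To make the evaluation criteria concrete, the following is a minimal sketch of how precision, sensitivity, and F1 score could be computed for a set of circRNA calls, and how calls from several tools might be combined by simple voting. It is an illustration only, not the pipeline used in the study: the function names `evaluate` and `consensus`, the `min_methods` threshold, the tool names, and the `chrom:start-end` identifiers are all hypothetical, and the abstract does not state how calls were matched to the reference set or how results were combined.

```python
from collections import Counter


def evaluate(predicted, truth):
    """Return (precision, sensitivity, F1) for predicted circRNA IDs.

    Assumption: a call counts as a true positive only if its identifier
    matches a reference identifier exactly.
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    sensitivity = true_positives / len(truth) if truth else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return precision, sensitivity, f1


def consensus(calls_by_method, min_methods=2):
    """Keep circRNAs reported by at least `min_methods` tools (hypothetical rule)."""
    counts = Counter(
        circ_id for calls in calls_by_method.values() for circ_id in set(calls)
    )
    return {circ_id for circ_id, n in counts.items() if n >= min_methods}


# Toy data: identifiers are made-up back-splice coordinates.
truth = {"chr1:100-500", "chr1:900-1500", "chr2:200-800", "chr3:50-400"}
calls = {
    "tool_x": {"chr1:100-500", "chr1:900-1500", "chr2:200-800",
               "chr5:10-90", "chr6:10-90"},
    "tool_y": {"chr1:100-500", "chr1:900-1500", "chr7:10-90"},
}

for name, predicted in calls.items():
    p, s, f1 = evaluate(predicted, truth)
    print(f"{name}: precision={p:.2f} sensitivity={s:.2f} F1={f1:.2f}")

combined = consensus(calls, min_methods=2)
p, s, f1 = evaluate(combined, truth)
print(f"consensus: precision={p:.2f} sensitivity={s:.2f} F1={f1:.2f}")
```

In this toy example the consensus set removes every false positive, so precision rises to 1.0, but sensitivity falls to 0.5 because circRNAs found by only one tool are dropped. This is one way in which higher precision need not translate into a higher F1 score.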

CONCLUSIONS

The results provide a systematic evaluation of circRNA detection methods on SE RNA-Seq data, which should help researchers choose strategies for circRNA analysis.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d37/8822704/bf4a745a8a4f/12864_2022_8329_Fig1_HTML.jpg
