Benchmark analysis of algorithms for determining and quantifying full-length mRNA splice forms from RNA-seq data.

Authors

Hayer Katharina E, Pizarro Angel, Lahens Nicholas F, Hogenesch John B, Grant Gregory R

Affiliations

University of Pennsylvania, Institute for Translational Medicine and Therapeutics, Philadelphia, PA 19104.

Scientific Computing at Amazon Web Services, Seattle, WA 98108.

Publication information

Bioinformatics. 2015 Dec 15;31(24):3938-45. doi: 10.1093/bioinformatics/btv488. Epub 2015 Sep 3.

DOI: 10.1093/bioinformatics/btv488
PMID: 26338770
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4673975/
Abstract

MOTIVATION

Because of the advantages of RNA sequencing (RNA-Seq) over microarrays, it is gaining widespread popularity for highly parallel gene expression analysis. For example, RNA-Seq is expected to be able to provide accurate identification and quantification of full-length splice forms. A number of informatics packages have been developed for this purpose, but short reads make it a difficult problem in principle. Sequencing error and polymorphisms add further complications. It has become necessary to perform studies to determine which algorithms perform best and which if any algorithms perform adequately. However, there is a dearth of independent and unbiased benchmarking studies. Here we take an approach using both simulated and experimental benchmark data to evaluate their accuracy.
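As an illustration of what scoring against simulated benchmark data can look like, the following is a minimal sketch that compares predicted full-length splice forms with a known truth set. It is not the authors' evaluation code. The function name `score_splice_forms`, the tuple representation of a splice form, and the choice of precision and recall as metrics are all assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical representation: (chromosome, strand, ((exon_start, exon_end), ...)).
# A prediction counts as correct only if every exon boundary matches exactly.
SpliceForm = Tuple[str, str, Tuple[Tuple[int, int], ...]]


def score_splice_forms(
    truth: Iterable[SpliceForm], predicted: Iterable[SpliceForm]
) -> Dict[str, float]:
    """Count exact-match true/false positives and false negatives."""
    truth_set = set(truth)
    predicted_set = set(predicted)

    tp = len(truth_set & predicted_set)   # predicted and truly present
    fp = len(predicted_set - truth_set)   # predicted but not in the truth set
    fn = len(truth_set - predicted_set)   # present in truth but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}


if __name__ == "__main__":
    # Toy data: two simulated splice forms of one gene, two predictions.
    truth = [
        ("chr1", "+", ((100, 200), (300, 400))),
        ("chr1", "+", ((100, 200), (500, 600))),
    ]
    predicted = [
        ("chr1", "+", ((100, 200), (300, 400))),  # exact match
        ("chr1", "+", ((100, 250), (300, 400))),  # wrong exon boundary
    ]
    print(score_splice_forms(truth, predicted))
```

Exact matching of every exon boundary is a deliberately strict criterion. A real benchmark might instead compare intron chains or tolerate differences at transcript ends, which would change the counts.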

RESULTS

We conclude that most methods are inaccurate even using idealized data, and that no method is highly accurate once multiple splice forms, polymorphisms, intron signal, sequencing errors, alignment errors, annotation errors and other complicating factors are present. These results point to the pressing need for further algorithm development.

AVAILABILITY AND IMPLEMENTATION

Simulated datasets and other supporting information can be found at http://bioinf.itmat.upenn.edu/BEERS/bp2.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3df0/4673975/1c1d26c140c5/btv488f1p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3df0/4673975/d101ca3785a2/btv488f2p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3df0/4673975/605b4469cb05/btv488f3p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3df0/4673975/d674fbe9c98f/btv488f4p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3df0/4673975/f595d911111c/btv488f5p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3df0/4673975/d102154c6e1b/btv488f6p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3df0/4673975/f49b083a3140/btv488f7p.jpg

Similar articles

1
Benchmark analysis of algorithms for determining and quantifying full-length mRNA splice forms from RNA-seq data.
Bioinformatics. 2015 Dec 15;31(24):3938-45. doi: 10.1093/bioinformatics/btv488. Epub 2015 Sep 3.
2
TIGAR: transcript isoform abundance estimation method with gapped alignment of RNA-Seq data by variational Bayesian inference.
Bioinformatics. 2013 Sep 15;29(18):2292-9. doi: 10.1093/bioinformatics/btt381. Epub 2013 Jul 2.
3
Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM).
Bioinformatics. 2011 Sep 15;27(18):2518-28. doi: 10.1093/bioinformatics/btr427. Epub 2011 Jul 19.
4
SSP: an interval integer linear programming for de novo transcriptome assembly and isoform discovery of RNA-seq reads.
Genomics. 2013 Nov-Dec;102(5-6):507-14. doi: 10.1016/j.ygeno.2013.10.003. Epub 2013 Oct 23.
5
Updating RNA-Seq analyses after re-annotation.
Bioinformatics. 2013 Jul 1;29(13):1631-7. doi: 10.1093/bioinformatics/btt197. Epub 2013 May 14.
6
ORMAN: optimal resolution of ambiguous RNA-Seq multimappings in the presence of novel isoforms.
Bioinformatics. 2014 Mar 1;30(5):644-51. doi: 10.1093/bioinformatics/btt591. Epub 2013 Oct 15.
7
CLASS: constrained transcript assembly of RNA-seq reads.
BMC Bioinformatics. 2013;14 Suppl 5(Suppl 5):S14. doi: 10.1186/1471-2105-14-S5-S14. Epub 2013 Apr 10.
8
EBARDenovo: highly accurate de novo assembly of RNA-Seq with efficient chimera-detection.
Bioinformatics. 2013 Apr 15;29(8):1004-10. doi: 10.1093/bioinformatics/btt092. Epub 2013 Mar 1.
9
PennDiff: detecting differential alternative splicing and transcription by RNA sequencing.
Bioinformatics. 2018 Jul 15;34(14):2384-2391. doi: 10.1093/bioinformatics/bty097.
10
Quantitative visualization of alternative exon expression from RNA-seq data.
Bioinformatics. 2015 Jul 15;31(14):2400-2. doi: 10.1093/bioinformatics/btv034. Epub 2015 Jan 22.

Cited by

1
A Deep Dive into Statistical Modeling of RNA Splicing QTLs Reveals New Variants that Explain Neurodegenerative Disease.
bioRxiv. 2024 Sep 3:2024.09.01.610696. doi: 10.1101/2024.09.01.610696.
2
Transcriptomic signatures across a critical sedimentation threshold in a major reef-building coral.
Front Physiol. 2024 Jun 11;15:1303681. doi: 10.3389/fphys.2024.1303681. eCollection 2024.
3
High Overexpression of Leads to Growth Inhibition and Protein Ectopic Localization in Transgenic .
Int J Mol Sci. 2024 May 27;25(11):5840. doi: 10.3390/ijms25115840.
4
ClusTrast: a short read de novo transcript isoform assembler guided by clustered contigs.
BMC Bioinformatics. 2024 Feb 1;25(1):54. doi: 10.1186/s12859-024-05663-3.
5
Challenges and best practices in omics benchmarking.
Nat Rev Genet. 2024 May;25(5):326-339. doi: 10.1038/s41576-023-00679-6. Epub 2024 Jan 12.
6
Alternative splicing analysis benchmark with DICAST.
NAR Genom Bioinform. 2023 May 30;5(2):lqad044. doi: 10.1093/nargab/lqad044. eCollection 2023 Jun.
7
A study of differential microRNA expression profile in migraine: the microMIG exploratory study.
J Headache Pain. 2023 Feb 17;24(1):11. doi: 10.1186/s10194-023-01542-z.
8
The hitchhikers' guide to RNA sequencing and functional analysis.
Brief Bioinform. 2023 Jan 19;24(1). doi: 10.1093/bib/bbac529.
9
The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma.
NAR Cancer. 2022 Feb 1;4(1):zcac001. doi: 10.1093/narcan/zcac001. eCollection 2022 Mar.
10
Experimental Design for Time-Series RNA-Seq Analysis of Gene Expression and Alternative Splicing.
Methods Mol Biol. 2022;2398:173-188. doi: 10.1007/978-1-0716-1912-4_14.

References

1
StringTie enables improved reconstruction of a transcriptome from RNA-seq reads.
Nat Biotechnol. 2015 Mar;33(3):290-5. doi: 10.1038/nbt.3122. Epub 2015 Feb 18.
2
A circadian gene expression atlas in mammals: implications for biology and medicine.
Proc Natl Acad Sci U S A. 2014 Nov 11;111(45):16219-24. doi: 10.1073/pnas.1408886111. Epub 2014 Oct 27.
3
HTSeq--a Python framework to work with high-throughput sequencing data.
Bioinformatics. 2015 Jan 15;31(2):166-9. doi: 10.1093/bioinformatics/btu638. Epub 2014 Sep 25.
4
IVT-seq reveals extreme bias in RNA sequencing.
Genome Biol. 2014 Jun 30;15(6):R86. doi: 10.1186/gb-2014-15-6-r86.
5
Efficient RNA isoform identification and quantification from RNA-Seq data with network flows.
Bioinformatics. 2014 Sep 1;30(17):2447-55. doi: 10.1093/bioinformatics/btu317. Epub 2014 May 9.
6
SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads.
Bioinformatics. 2014 Jun 15;30(12):1660-6. doi: 10.1093/bioinformatics/btu077. Epub 2014 Feb 13.
7
PennSeq: accurate isoform-specific gene expression quantification in RNA-Seq by modeling non-uniform read distribution.
Nucleic Acids Res. 2014 Feb;42(3):e20. doi: 10.1093/nar/gkt1304. Epub 2013 Dec 20.
8
RefSeq: an update on mammalian reference sequences.
Nucleic Acids Res. 2014 Jan;42(Database issue):D756-63. doi: 10.1093/nar/gkt1114. Epub 2013 Nov 19.
9
Assessment of transcript reconstruction methods for RNA-seq.
Nat Methods. 2013 Dec;10(12):1177-84. doi: 10.1038/nmeth.2714. Epub 2013 Nov 3.
10
Systematic evaluation of spliced alignment programs for RNA-seq data.
Nat Methods. 2013 Dec;10(12):1185-91. doi: 10.1038/nmeth.2722. Epub 2013 Nov 3.