
Contrasting and Combining Transcriptome Complexity Captured by Short and Long RNA Sequencing Reads.

Authors

Han Seong Woo, Jewell San, Thomas-Tikhonenko Andrei, Barash Yoseph

Affiliations

Department of Computer and Information Sciences, School of Engineering, University of Pennsylvania.

Department of Genetics, Perelman School of Medicine, University of Pennsylvania.

Publication information

bioRxiv. 2023 Nov 21:2023.11.21.568046. doi: 10.1101/2023.11.21.568046.

DOI:10.1101/2023.11.21.568046
PMID:38045232
Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10690182/
Abstract

Mapping transcriptomic variations using either short-read or long-read RNA sequencing is a staple of genomic research. Long reads are able to capture entire isoforms and overcome repetitive regions, while short reads still provide higher coverage and lower error rates. Yet how to quantitatively compare the technologies, whether they can be combined, and what the benefit of such a combined view may be remain open questions. We tackle these questions by first creating a pipeline to assess matched long-read and short-read data using a variety of transcriptome statistics. We find that across datasets, algorithms, and technologies, matched short-read data detect roughly 50% more splice junctions, and 10-30% of the splice junctions included at 20% or more are missed by long reads. In contrast, long reads detect many more intron retention events, pointing to the benefit of combining the technologies. We introduce MAJIQ-L, an extension of the MAJIQ software that enables a unified view of transcriptome variations from both technologies, and demonstrate its benefits. Our software can be used to assess any future long-read technology or algorithm and to combine it with short-read data for improved transcriptome analysis.
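
The kind of junction-level comparison the abstract describes can be illustrated with a small sketch. This is not the MAJIQ-L implementation or its API. It is a toy example that assumes junctions are keyed by (chromosome, strand, start, end), that each short-read junction comes with an inclusion level between 0 and 1, and that long reads contribute only a set of detected junctions. The function name `compare_junctions`, the `min_inclusion` parameter, and all data values are hypothetical.

```python
from typing import Dict, Set, Tuple

# Assumed junction key: (chromosome, strand, start, end). Illustrative only.
Junction = Tuple[str, str, int, int]


def compare_junctions(
    short_psi: Dict[Junction, float],
    long_junctions: Set[Junction],
    min_inclusion: float = 0.2,
) -> Dict[str, float]:
    """Summarize junction overlap between matched short- and long-read data.

    short_psi maps each short-read junction to an inclusion level in [0, 1].
    long_junctions is the set of junctions detected by long reads.
    """
    short_junctions = set(short_psi)
    # Short-read junctions at or above the inclusion threshold.
    included = {j for j, psi in short_psi.items() if psi >= min_inclusion}
    # Of those, the ones that long reads did not detect.
    missed = included - long_junctions
    return {
        "short_only": len(short_junctions - long_junctions),
        "long_only": len(long_junctions - short_junctions),
        "shared": len(short_junctions & long_junctions),
        "fraction_included_missed": len(missed) / len(included) if included else 0.0,
    }


# Made-up toy data, not taken from the paper.
short_psi = {
    ("chr1", "+", 100, 200): 0.9,
    ("chr1", "+", 100, 300): 0.1,
    ("chr1", "+", 400, 500): 0.5,
    ("chr2", "-", 50, 150): 0.25,
}
long_junctions = {
    ("chr1", "+", 100, 200),
    ("chr2", "-", 50, 150),
    ("chr3", "+", 10, 90),
}

summary = compare_junctions(short_psi, long_junctions)
print(summary)
```

With a 0.2 threshold, `fraction_included_missed` is the share of well-included short-read junctions that are absent from the long-read set, which corresponds loosely to the "included at 20% or more" statistic in the abstract. A real analysis would derive both inputs from aligned reads and would need to handle coverage filters and annotation status, which this sketch ignores.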

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/428a/10690182/4305897c1140/nihpp-2023.11.21.568046v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/428a/10690182/e804464324d0/nihpp-2023.11.21.568046v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/428a/10690182/c06bf7b3e571/nihpp-2023.11.21.568046v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/428a/10690182/ab81ca5fd06e/nihpp-2023.11.21.568046v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/428a/10690182/3a7cfb702b01/nihpp-2023.11.21.568046v1-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/428a/10690182/6c89b2227d42/nihpp-2023.11.21.568046v1-f0006.jpg

Similar articles

1
Contrasting and combining transcriptome complexity captured by short and long RNA sequencing reads.
Genome Res. 2024 Oct 29;34(10):1624-1635. doi: 10.1101/gr.278659.123.
2
Freddie: annotation-independent detection and discovery of transcriptomic alternative splicing isoforms using long-read sequencing.
Nucleic Acids Res. 2023 Jan 25;51(2):e11. doi: 10.1093/nar/gkac1112.
3
A Full-Length mRNA Transcriptome Generated From Hybrid-Corrected PacBio Long-Reads Improves the Transcript Annotation and Identifies Thousands of Novel Splice Variants in Atlantic Salmon.
Front Genet. 2021 Apr 27;12:656334. doi: 10.3389/fgene.2021.656334. eCollection 2021.
4
Extending rnaSPAdes functionality for hybrid transcriptome assembly.
BMC Bioinformatics. 2020 Jul 24;21(Suppl 12):302. doi: 10.1186/s12859-020-03614-2.
5
Improved transcriptome assembly using a hybrid of long and short reads with StringTie.
PLoS Comput Biol. 2022 Jun 1;18(6):e1009730. doi: 10.1371/journal.pcbi.1009730. eCollection 2022 Jun.
6
Event Analysis: Using Transcript Events To Improve Estimates of Abundance in RNA-seq Data.
G3 (Bethesda). 2018 Aug 30;8(9):2923-2940. doi: 10.1534/g3.118.200373.
7
Direct full-length RNA sequencing reveals unexpected transcriptome complexity during development.
Genome Res. 2020 Feb;30(2):287-298. doi: 10.1101/gr.251512.119. Epub 2020 Feb 5.
8
Optimal spliced alignments of short sequence reads.
Bioinformatics. 2008 Aug 15;24(16):i174-80. doi: 10.1093/bioinformatics/btn300.
9
A hybrid and scalable error correction algorithm for indel and substitution errors of long reads.
BMC Genomics. 2019 Dec 20;20(Suppl 11):948. doi: 10.1186/s12864-019-6286-9.
