Partitioning RNAs by length improves transcriptome reconstruction from short-read RNA-seq data.

Affiliations

Gene Center, Ludwig-Maximilians-Universität München, Munich, Germany.

Department of Biochemistry & Biophysics, University of California, San Francisco, San Francisco, CA, USA.

Publication information

Nat Biotechnol. 2022 May;40(5):741-750. doi: 10.1038/s41587-021-01136-7. Epub 2022 Jan 10.

Abstract

The accuracy of methods for assembling transcripts from short-read RNA sequencing data is limited by the lack of long-range information. Here we introduce Ladder-seq, an approach that separates transcripts according to their lengths before sequencing and uses the additional information to improve the quantification and assembly of transcripts. Using simulated data, we show that a kallisto algorithm extended to process Ladder-seq data quantifies transcripts of complex genes with substantially higher accuracy than conventional kallisto. For reference-based assembly, a tailored scheme based on the StringTie2 algorithm reconstructs a single transcript with 30.8% higher precision than its conventional counterpart and is more than 30% more sensitive for complex genes. For de novo assembly, a similar scheme based on the Trinity algorithm correctly assembles 78% more transcripts than conventional Trinity while improving precision by 78%. In experimental data, Ladder-seq reveals 40% more genes harboring isoform switches compared to conventional RNA sequencing and unveils widespread changes in isoform usage upon m6A depletion by Mettl14 knockout.
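To make the idea of using length information during quantification concrete, the following is a toy sketch only. It is not the authors' kallisto extension, and the function name `em_quantify`, the two isoform names, and the band probabilities are all hypothetical values chosen for illustration. It assumes each read carries a label for the length band it was sequenced from, and that the probability of a transcript appearing in each band is known. A plain expectation-maximization (EM) loop then weights each ambiguous read by that band probability.

```python
def em_quantify(reads, transcripts, band_prob=None, n_iter=200):
    """Toy EM for relative transcript abundance.

    reads: list of (set_of_compatible_transcripts, band_index)
    band_prob: optional dict transcript -> list of P(band | transcript);
               hypothetical values, not taken from the paper.
    """
    # Start from a uniform abundance estimate.
    theta = {t: 1.0 / len(transcripts) for t in transcripts}
    for _ in range(n_iter):
        counts = {t: 0.0 for t in transcripts}
        for compatible, band in reads:
            # E-step: weight each compatible transcript by its abundance
            # and, if available, by the chance it lands in this band.
            weights = {}
            for t in compatible:
                w = theta[t]
                if band_prob is not None:
                    w *= band_prob[t][band]
                weights[t] = w
            total = sum(weights.values())
            if total == 0.0:
                continue
            for t, w in weights.items():
                counts[t] += w / total
        # M-step: renormalize expected read counts into abundances.
        norm = sum(counts.values())
        theta = {t: c / norm for t, c in counts.items()}
    return theta


transcripts = ["short_isoform", "long_isoform"]
# Every read maps to an exon shared by both isoforms, so sequence alone
# cannot tell them apart; 30 reads came from band 0, 10 from band 1.
reads = [(set(transcripts), 0)] * 30 + [(set(transcripts), 1)] * 10
# Hypothetical band probabilities: the short isoform is mostly in band 0.
band_prob = {"short_isoform": [0.9, 0.1], "long_isoform": [0.1, 0.9]}

plain = em_quantify(reads, transcripts)
with_bands = em_quantify(reads, transcripts, band_prob)
print(plain)
print(with_bands)
```

In this toy setting the reads alone leave the two isoforms indistinguishable, so the estimate without band labels stays at the uniform starting point, while the band labels shift the estimate toward the short isoform. Real data would also need effective-length normalization and an empirically estimated band model, both of which this sketch omits.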
