Benchmarking long-read RNA-sequencing analysis tools using in silico mixtures.

Affiliations

The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia.

Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia.

Publication information

Nat Methods. 2023 Nov;20(11):1810-1821. doi: 10.1038/s41592-023-02026-3. Epub 2023 Oct 2.

DOI:10.1038/s41592-023-02026-3

PMID:37783886

Abstract

The lack of benchmark data sets with inbuilt ground truth makes it challenging to compare the performance of existing long-read isoform detection and differential expression analysis workflows. Here, we present a benchmark experiment using two human lung adenocarcinoma cell lines that were each profiled in triplicate together with synthetic, spliced, spike-in RNAs (sequins). Samples were deeply sequenced on both Illumina short-read and Oxford Nanopore Technologies long-read platforms. Alongside the ground truth available via the sequins, we created in silico mixture samples to allow performance assessment in the absence of true positives or true negatives. Our results show that StringTie2 and bambu outperformed the other tools among the six isoform detection tools tested, and that DESeq2, edgeR and limma-voom were best among the five differential transcript expression tools tested. There was no clear front-runner for differential transcript usage analysis among the five tools compared, which suggests that further methods development is needed for this application.

Similar articles

Benchmarking long-read RNA-sequencing analysis tools using in silico mixtures.

Nat Methods. 2023 Nov;20(11):1810-1821. doi: 10.1038/s41592-023-02026-3. Epub 2023 Oct 2.

The long and the short of it: unlocking nanopore long-read RNA sequencing data with short-read differential expression analysis tools.

NAR Genom Bioinform. 2021 Apr 26;3(2):lqab028. doi: 10.1093/nargab/lqab028. eCollection 2021 Jun.

Comprehensive assessment of mRNA isoform detection methods for long-read sequencing data.

Nat Commun. 2024 May 10;15(1):3972. doi: 10.1038/s41467-024-48117-3.

A benchmarking of workflows for detecting differential splicing and differential expression at isoform level in human RNA-seq studies.

Brief Bioinform. 2019 Mar 22;20(2):471-481. doi: 10.1093/bib/bbx122.

Transcript Profiling Using Long-Read Sequencing Technologies.

Methods Mol Biol. 2018;1783:121-147. doi: 10.1007/978-1-4939-7834-2_6.

PSI-Sigma: a comprehensive splicing-detection method for short-read and long-read RNA-seq analysis.

Bioinformatics. 2019 Dec 1;35(23):5048-5054. doi: 10.1093/bioinformatics/btz438.

Improving nanopore read accuracy with the R2C2 method enables the sequencing of highly multiplexed full-length single-cell cDNA.

Proc Natl Acad Sci U S A. 2018 Sep 25;115(39):9726-9731. doi: 10.1073/pnas.1806447115. Epub 2018 Sep 10.

Identification of Protein Isoforms Using Reference Databases Built from Long and Short Read RNA-Sequencing.

J Proteome Res. 2022 Jul 1;21(7):1628-1639. doi: 10.1021/acs.jproteome.1c00968. Epub 2022 May 25.

Freddie: annotation-independent detection and discovery of transcriptomic alternative splicing isoforms using long-read sequencing.

Nucleic Acids Res. 2023 Jan 25;51(2):e11. doi: 10.1093/nar/gkac1112.

Benchmarking RNA-seq differential expression analysis methods using spike-in and simulation data.

PLoS One. 2020 Apr 30;15(4):e0232271. doi: 10.1371/journal.pone.0232271. eCollection 2020.

Cited by

Enhancing transcriptome expression quantification through accurate assignment of long RNA sequencing reads with TranSigner.

Genome Biol. 2025 Aug 28;26(1):257. doi: 10.1186/s13059-025-03723-2.

Isoform-level profiling of m6A epitranscriptomic signatures in human brain.

Sci Adv. 2025 Aug 8;11(32):eadp0783. doi: 10.1126/sciadv.adp0783.

Long-read RNA sequencing unveils a novel cryptic exon in MNAT1 along with its full-length transcript structure in TDP-43 proteinopathy.

Commun Biol. 2025 Jul 16;8(1):1056. doi: 10.1038/s42003-025-08463-4.

Reassessing lidocaine as an electroporation sensitizer in vitro.

Sci Rep. 2025 Jul 15;15(1):25593. doi: 10.1038/s41598-025-11695-3.

Excitation and electroporation in genetically engineered excitable S-HEK cells exposed to electric pulses of different durations.

Sci Rep. 2025 Jul 2;15(1):23451. doi: 10.1038/s41598-025-06989-5.

Quantitative isoform profiling using deep coverage long-read RNA sequencing across early endothelial differentiation.

bioRxiv. 2025 Jun 2:2025.05.30.656561. doi: 10.1101/2025.05.30.656561.

Investigating RNA dynamics from single molecule transcriptomes.

Trends Genet. 2025 Jun 4. doi: 10.1016/j.tig.2025.05.001.

Improving gene isoform quantification with miniQuant.

Nat Biotechnol. 2025 Jun 3. doi: 10.1038/s41587-025-02633-9.

A tumor necrosis factor-α-responsive cryptic promoter drives overexpression of the human endogenous retrovirus ERVK-7.

J Biol Chem. 2025 Apr 30;301(6):108568. doi: 10.1016/j.jbc.2025.108568.

Definer: A computational method for accurate identification of RNA pseudouridine sites based on deep learning.

PLoS One. 2025 Apr 24;20(4):e0320077. doi: 10.1371/journal.pone.0320077. eCollection 2025.

References

satuRn: Scalable analysis of differential transcript usage for bulk and single-cell RNA-sequencing applications.

F1000Res. 2021 May 11;10:374. doi: 10.12688/f1000research.51749.2. eCollection 2021.

Improved transcriptome assembly using a hybrid of long and short reads with StringTie.

PLoS Comput Biol. 2022 Jun 1;18(6):e1009730. doi: 10.1371/journal.pcbi.1009730. eCollection 2022 Jun.

Accurate expression quantification from nanopore direct RNA sequencing with NanoCount.

Nucleic Acids Res. 2022 Feb 28;50(4):e19. doi: 10.1093/nar/gkab1129.

Comprehensive characterization of single-cell full-length isoforms in human and mouse with long-read sequencing.

Genome Biol. 2021 Nov 11;22(1):310. doi: 10.1186/s13059-021-02525-6.

Inflammation drives alternative first exon usage to regulate immune genes including a novel iron-regulated isoform of Aim2.

Elife. 2021 May 28;10:e69431. doi: 10.7554/eLife.69431.

Generation of an isoform-level transcriptome atlas of macrophage activation.

J Biol Chem. 2021 Jan-Jun;296:100784. doi: 10.1016/j.jbc.2021.100784. Epub 2021 May 14.

The long and the short of it: unlocking nanopore long-read RNA sequencing data with short-read differential expression analysis tools.

NAR Genom Bioinform. 2021 Apr 26;3(2):lqab028. doi: 10.1093/nargab/lqab028. eCollection 2021 Jun.

Twelve years of SAMtools and BCFtools.

Gigascience. 2021 Feb 16;10(2). doi: 10.1093/gigascience/giab008.

Complete characterization of the human immune cell transcriptome using accurate full-length cDNA sequencing.

Genome Res. 2020 Apr;30(4):589-601. doi: 10.1101/gr.257188.119. Epub 2020 Apr 20.

Full-length transcript characterization of SF3B1 mutation in chronic lymphocytic leukemia reveals downregulation of retained introns.

Nat Commun. 2020 Mar 18;11(1):1438. doi: 10.1038/s41467-020-15171-6.
