isONform: reference-free transcriptome reconstruction from Oxford Nanopore data.

Affiliation

Department of Mathematics, Science for Life Laboratory, Stockholm University, Stockholm 106 91, Sweden.

Publication information

Bioinformatics. 2023 Jun 30;39(39 Suppl 1):i222-i231. doi: 10.1093/bioinformatics/btad264.

DOI:10.1093/bioinformatics/btad264

PMID:37387174

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10311309/

Abstract

MOTIVATION

With advances in long-read transcriptome sequencing, we can now fully sequence transcripts, which greatly improves our ability to study transcription processes. A popular long-read transcriptome sequencing platform is Oxford Nanopore Technologies (ONT), which, through its cost-effective sequencing and high throughput, has the potential to characterize the transcriptome in a cell. However, due to transcript variability and sequencing errors, long cDNA reads need substantial bioinformatic processing to produce a set of isoform predictions. Several genome- and annotation-based methods exist to produce transcript predictions. However, such methods require high-quality genomes and annotations and are limited by the accuracy of long-read splice aligners. In addition, gene families with high heterogeneity may not be well represented by a reference genome and would benefit from reference-free analysis. Reference-free methods to predict transcripts from ONT data, such as RATTLE, exist, but their sensitivity is not comparable to that of reference-based approaches.

RESULTS

We present isONform, a high-sensitivity algorithm to construct isoforms from ONT cDNA sequencing data. The algorithm is based on iterative bubble popping on gene graphs built from fuzzy seeds from the reads. Using simulated, synthetic, and biological ONT cDNA data, we show that isONform has substantially higher sensitivity than RATTLE albeit with some loss in precision. On biological data, we show that isONform's predictions have substantially higher consistency with the annotation-based method StringTie2 compared with RATTLE. We believe isONform can be used both for isoform construction for organisms without well-annotated genomes and as an orthogonal method to verify predictions of reference-based methods.
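The idea of iterative bubble popping on a seed graph can be illustrated with a toy sketch. This is not isONform's implementation: it assumes each read has already been reduced to an ordered list of seed labels with no seed repeated within a read, and it uses the difference in path length as a crude stand-in for a real similarity test between the two sides of a bubble. All function names, parameters (`max_inner`, `max_len_diff`, `max_rounds`), and the example reads are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def find_bubbles(reads, max_inner):
    """Map (source, sink) -> {inner_path: read ids} for seed pairs joined by >= 2 distinct paths."""
    paths = defaultdict(lambda: defaultdict(set))
    for rid, seeds in reads.items():
        for i, j in combinations(range(len(seeds)), 2):
            if j - i - 1 <= max_inner:  # only consider short detours
                paths[(seeds[i], seeds[j])][tuple(seeds[i + 1:j])].add(rid)
    return {pair: v for pair, v in paths.items() if len(v) >= 2}


def pop_once(reads, max_inner, max_len_diff):
    """Pop one bubble (smallest first); return True if any read was rewritten."""
    bubbles = find_bubbles(reads, max_inner)
    # Smallest bubbles first, so nested bubbles are resolved inside-out.
    order = sorted(bubbles.items(),
                   key=lambda kv: (max(len(p) for p in kv[1]), kv[0]))
    for (src, snk), variants in order:
        ranked = sorted(variants, key=lambda p: (-len(variants[p]), p))
        major = ranked[0]  # best-supported side of the bubble
        for minor in ranked[1:]:
            # ASSUMPTION: length difference stands in for sequence similarity.
            if abs(len(major) - len(minor)) <= max_len_diff:
                for rid in variants[minor]:
                    seeds = reads[rid]
                    i, j = seeds.index(src), seeds.index(snk)
                    reads[rid] = seeds[:i + 1] + list(major) + seeds[j:]
                return True
    return False


def reconstruct(reads, max_inner=3, max_len_diff=1, max_rounds=1000):
    """Pop bubbles until none remain, then group reads by their final path."""
    reads = {rid: list(seeds) for rid, seeds in reads.items()}
    for _ in range(max_rounds):  # hard cap as a safety net
        if not pop_once(reads, max_inner, max_len_diff):
            break
    isoforms = defaultdict(list)
    for rid, seeds in sorted(reads.items()):
        isoforms[tuple(seeds)].append(rid)
    return dict(isoforms)


if __name__ == "__main__":
    example = {
        "r1": ["A", "B", "C", "D"],
        "r2": ["A", "B", "C", "D"],
        "r3": ["A", "B", "X", "D"],  # X mimics an error-induced variant of C
        "r4": ["A", "D"],            # skips B and C, mimicking a second isoform
    }
    for path, members in reconstruct(example).items():
        print(path, members)
```

In this toy input, the bubble between `B` and `D` has two sides of equal length, so the minority read `r3` is merged onto the majority path. The bubble between `A` and `D` differs by two seeds, which exceeds `max_len_diff`, so `r4` is kept as a separate predicted isoform.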

AVAILABILITY AND IMPLEMENTATION

https://github.com/aljpetri/isONform.
