Oarfish: Enhanced probabilistic modeling leads to improved accuracy in long read transcriptome quantification.

Author information

Zahra Zare Jousheghani, Rob Patro

Affiliations

Department of Electrical and Computer Engineering, University of Maryland, College Park, 20742, Maryland, USA.

Department of Computer Science, University of Maryland, College Park, 20742, Maryland, USA.

Publication information

bioRxiv. 2024 Mar 1:2024.02.28.582591. doi: 10.1101/2024.02.28.582591.

DOI: 10.1101/2024.02.28.582591
PMID: 38464200
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10925290/

Abstract

MOTIVATION

Long read sequencing technology is becoming an increasingly indispensable tool in genomic and transcriptomic analysis. In transcriptomics in particular, long reads offer the possibility of sequencing full-length isoforms, which can vastly simplify the identification of novel transcripts and transcript quantification. However, despite this promise, the focus of much long read method development to date has been on transcript identification, with comparatively little attention paid to quantification. Yet, due to differences in the underlying protocols and technologies, lower throughput (i.e. fewer reads sequenced per sample compared to short read technologies), as well as technical artifacts, long read quantification remains a challenge, motivating the continued development and assessment of quantification methods tailored to this increasingly prevalent type of data.

RESULTS

We introduce a new method and software tool for long read transcript quantification called oarfish. Our model incorporates a novel and innovative coverage score, which affects the conditional probability of fragment assignment in the underlying probabilistic model. We demonstrate that by accounting for this coverage information, oarfish is able to produce more accurate quantification estimates than existing long read quantification methods, particularly when one considers the primary isoforms present in a particular cell line or tissue type.
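
The abstract does not give the form of the coverage score or of the model, so the following is only a minimal sketch of the general idea: a per-alignment weight enters the conditional probability of assigning a read to a transcript inside a standard EM abundance estimate. The function name `em_quantify`, the input layout, and the weights themselves are hypothetical placeholders for illustration; this is not oarfish's implementation or its actual coverage score.

```python
def em_quantify(read_alignments, num_transcripts, n_iter=500, tol=1e-12):
    """Estimate relative transcript abundances with a plain EM loop.

    read_alignments: one entry per read; each entry is a list of
        (transcript_index, weight) pairs. The weight is an ASSUMED,
        externally supplied stand-in for a coverage-derived conditional
        probability of the read given that transcript.
    Returns a list of abundances that sums to 1.
    """
    theta = [1.0 / num_transcripts] * num_transcripts
    for _ in range(n_iter):
        counts = [0.0] * num_transcripts
        # E-step: split each read across its candidate transcripts in
        # proportion to (current abundance) x (conditional weight).
        for alignments in read_alignments:
            denom = sum(theta[t] * w for t, w in alignments)
            if denom == 0.0:
                continue  # read has no support under the current estimate
            for t, w in alignments:
                counts[t] += theta[t] * w / denom
        total = sum(counts)
        if total == 0.0:
            break
        # M-step: abundances are the normalized expected read counts.
        new_theta = [c / total for c in counts]
        delta = max(abs(a - b) for a, b in zip(theta, new_theta))
        theta = new_theta
        if delta < tol:
            break
    return theta


# Toy data: two transcripts, two uniquely mapped reads each,
# plus four reads that align to both transcripts.
unique = [[(0, 1.0)], [(0, 1.0)], [(1, 1.0)], [(1, 1.0)]]
flat_reads = unique + [[(0, 0.5), (1, 0.5)]] * 4      # no coverage information
skewed_reads = unique + [[(0, 0.9), (1, 0.1)]] * 4    # weights favour transcript 0

flat = em_quantify(flat_reads, 2)
skewed = em_quantify(skewed_reads, 2)
print("flat weights:  ", flat)
print("skewed weights:", skewed)
```

With equal weights the ambiguous reads are split evenly and the two transcripts receive the same abundance; with weights that favour transcript 0, more of the ambiguous reads are assigned to it. That is the sense in which a score attached to each alignment can change the final quantification.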

AVAILABILITY AND IMPLEMENTATION

Oarfish is implemented in the Rust programming language, and is made available as free and open-source software under the BSD 3-clause license. The source code is available at https://www.github.com/COMBINE-lab/oarfish.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c40/10925290/4a125031ae8f/nihpp-2024.02.28.582591v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c40/10925290/d65f4469efe7/nihpp-2024.02.28.582591v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c40/10925290/89076463e882/nihpp-2024.02.28.582591v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c40/10925290/17b0daa0a5ed/nihpp-2024.02.28.582591v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c40/10925290/7666b69fb278/nihpp-2024.02.28.582591v1-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c40/10925290/c772177b03ee/nihpp-2024.02.28.582591v1-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c40/10925290/8da6f044e0bb/nihpp-2024.02.28.582591v1-f0007.jpg
