Accurate inference of isoforms from multiple sample RNA-Seq data.

Authors

Tasnim Masruba, Ma Shining, Yang Ei-Wen, Jiang Tao, Li Wei

Publication information

BMC Genomics. 2015;16 Suppl 2(Suppl 2):S15. doi: 10.1186/1471-2164-16-S2-S15. Epub 2015 Jan 21.

DOI:10.1186/1471-2164-16-S2-S15
PMID:25708199
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4331715/
Abstract

BACKGROUND

RNA-Seq based transcriptome assembly has become a fundamental technique for studying expressed mRNAs (i.e., transcripts or isoforms) in a cell using high-throughput sequencing technologies, and serves as a basis for analyzing the structural and quantitative differences of expressed isoforms between samples. However, current transcriptome assembly algorithms are not specifically designed to handle the large number of errors inherent in real RNA-Seq datasets, especially those involving multiple samples, which makes downstream differential analysis difficult. On the other hand, multiple sample RNA-Seq datasets may provide more information than single sample datasets, and this information can be used to improve transcriptome assembly and abundance estimation, but it remains overlooked by existing assembly tools.

RESULTS

We formulate a computational framework for transcriptome assembly that is capable of handling noisy RNA-Seq reads and multiple sample RNA-Seq datasets efficiently. We show that finding an optimal solution under this framework is an NP-hard problem. We therefore develop an efficient heuristic algorithm, called Iterative Shortest Path (ISP), based on linear programming (LP) and integer linear programming (ILP). Our preliminary experimental results on both simulated and real datasets, and a comparison with existing assembly tools, demonstrate that (i) the ISP algorithm is able to assemble transcriptomes with greatly increased precision while keeping the same level of sensitivity, especially when many samples are involved, and (ii) its assembly results help improve downstream differential analysis. The source code of ISP is freely available at http://alumni.cs.ucr.edu/~liw/isp.html.
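To make the precision and sensitivity claim in (i) concrete, the following minimal sketch shows how the two measures are commonly computed for a set of assembled transcripts against a reference annotation. It is an illustration only, not the evaluation code used in the paper: the exact exon-chain matching rule, the function name `precision_sensitivity`, and the toy coordinates are all assumptions introduced here.

```python
from typing import Iterable, Tuple

# Assumption: a transcript is modelled as a tuple of (start, end) exon intervals.
Transcript = Tuple[Tuple[int, int], ...]


def precision_sensitivity(predicted: Iterable[Transcript],
                          reference: Iterable[Transcript]) -> Tuple[float, float]:
    """Return (precision, sensitivity) under exact exon-chain matching.

    precision   = matched transcripts / predicted transcripts
    sensitivity = matched transcripts / reference transcripts
    """
    pred = set(predicted)
    ref = set(reference)
    matched = pred & ref  # a prediction counts only if every exon agrees exactly
    precision = len(matched) / len(pred) if pred else 0.0
    sensitivity = len(matched) / len(ref) if ref else 0.0
    return precision, sensitivity


# Toy reference annotation with three isoforms of one hypothetical gene.
reference = [
    ((100, 200), (300, 400)),
    ((100, 200), (500, 600)),
    ((100, 200), (300, 400), (500, 600)),
]

# Hypothetical assembler A: two correct isoforms plus two spurious ones.
assembler_a = [
    reference[0],
    reference[1],
    ((100, 200), (350, 400)),
    ((100, 250), (500, 600)),
]

# Hypothetical assembler B: the same two correct isoforms, no spurious ones.
assembler_b = [reference[0], reference[1]]

if __name__ == "__main__":
    for name, preds in (("A", assembler_a), ("B", assembler_b)):
        p, s = precision_sensitivity(preds, reference)
        print(f"assembler {name}: precision={p:.2f} sensitivity={s:.2f}")
```

In this toy case both assemblers recover the same two reference isoforms, so their sensitivity is equal, while assembler B reports no false isoforms and therefore has higher precision. That is the pattern the abstract describes: higher precision at the same sensitivity.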


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d5/4331715/1bdccedfaf38/1471-2164-16-S2-S15-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d5/4331715/96a5fa80f7a1/1471-2164-16-S2-S15-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d5/4331715/576f70e60ebe/1471-2164-16-S2-S15-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d5/4331715/180e19567f17/1471-2164-16-S2-S15-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/94d5/4331715/7045533d366d/1471-2164-16-S2-S15-5.jpg
