Identifying differentially expressed transcripts from RNA-seq data with biological variation.

Affiliation

School of Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, UK.

Publication information

Bioinformatics. 2012 Jul 1;28(13):1721-8. doi: 10.1093/bioinformatics/bts260. Epub 2012 May 3.

DOI: 10.1093/bioinformatics/bts260

PMID: 22563066

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3381971/

Abstract

MOTIVATION

High-throughput sequencing enables expression analysis at the level of individual transcripts. Analysing transcriptome expression levels and estimating differential expression (DE) require a probabilistic approach that properly accounts for the ambiguity caused by shared exons and finite read sampling, as well as for the intrinsic biological variance of transcript expression.

RESULTS

We present Bayesian inference of transcripts from sequencing data (BitSeq), a Bayesian approach for estimation of transcript expression level from RNA-seq experiments. Inferred relative expression is represented by Markov chain Monte Carlo samples from the posterior probability distribution of a generative model of the read data. We propose a novel method for DE analysis across replicates which propagates uncertainty from the sample-level model while modelling biological variance using an expression-level-dependent prior. We demonstrate the advantages of our method using simulated data as well as an RNA-seq dataset with technical and biological replication for both studied conditions.
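As a rough illustration of what it means to carry expression estimates as posterior samples into a DE comparison, the sketch below summarises two sets of samples by the fraction of paired draws in which the log expression ratio is positive. This is a minimal sketch under stated assumptions, not BitSeq's implementation: the function name `prob_positive_log_ratio` is hypothetical, the samples are synthetic log-normal draws rather than MCMC output, and the biological-variance prior described above is not modelled.

```python
import numpy as np


def prob_positive_log_ratio(samples_a, samples_b):
    """Fraction of paired samples where log(b) - log(a) > 0.

    Hypothetical helper for illustration; inputs are positive
    expression samples of equal length for one transcript.
    """
    log_a = np.log(np.asarray(samples_a, dtype=float))
    log_b = np.log(np.asarray(samples_b, dtype=float))
    return float(np.mean(log_b - log_a > 0))


# Synthetic stand-ins for posterior samples (assumption: not real BitSeq output).
rng = np.random.default_rng(0)
cond_a = rng.lognormal(mean=1.0, sigma=0.3, size=5000)   # condition A
cond_b = rng.lognormal(mean=2.0, sigma=0.3, size=5000)   # condition B, up-regulated
same_as_a = rng.lognormal(mean=1.0, sigma=0.3, size=5000)  # no real change

shifted = prob_positive_log_ratio(cond_a, cond_b)
unchanged = prob_positive_log_ratio(cond_a, same_as_a)
print(f"up-regulated transcript: {shifted:.3f}")
print(f"unchanged transcript:    {unchanged:.3f}")
```

A value near 1 or 0 suggests a consistent change in one direction across the samples, while a value near 0.5 suggests no clear difference. Because the summary is computed over all samples rather than a single point estimate, wide posteriors naturally pull the value towards 0.5.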

AVAILABILITY

BitSeq, the implementation of the transcriptome expression estimation and differential expression analysis, is written in C++ and Python. The software is available online from http://code.google.com/p/bitseq/; version 0.4 was used to generate the results presented in this article.
