Statistical methods on detecting differentially expressed genes for RNA-seq data.

Authors

Chen Zhongxue, Liu Jianzhong, Ng Hon Keung Tony, Nadarajah Saralees, Kaufman Howard L, Yang Jack Y, Deng Youping

Affiliation

Biostatistics Epidemiology Research Design Core, Center for Clinical and Translational Sciences, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Publication information

BMC Syst Biol. 2011;5 Suppl 3(Suppl 3):S1. doi: 10.1186/1752-0509-5-S3-S1. Epub 2011 Dec 23.

DOI: 10.1186/1752-0509-5-S3-S1
PMID: 22784615
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3287564/
Abstract

BACKGROUND

For RNA-seq data, the aggregated count of short reads from a gene is used to approximate that gene's expression level. The count data can be modelled as samples from Poisson distributions with possibly different parameters. To detect genes that are differentially expressed between two conditions, statistical methods for testing the difference between two Poisson means are used. When the expression level of a gene is low, i.e., the read count is small, mean differences are usually harder to detect, so methods that are more powerful at low expression levels are particularly desirable. Several methods for comparing two Poisson means (rates) have been proposed in the statistical literature. In this paper, we compare these methods using simulated and real RNA-seq data.
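
As an illustration of comparing two Poisson means, the following is a minimal sketch of a conditional exact test. It assumes one count per condition and equal exposure in both conditions (no library-size adjustment), which the abstract does not specify. The function name `conditional_exact_pvalue` is hypothetical. Under these assumptions, conditioning on the total count makes the first count Binomial(n, 1/2) under the null hypothesis.

```python
from math import comb


def conditional_exact_pvalue(x1: int, x2: int) -> float:
    """Two-sided p-value for H0: lambda1 == lambda2.

    Assumption (not from the source): both counts come from equal exposure,
    so given n = x1 + x2, x1 ~ Binomial(n, 1/2) under H0.
    """
    n = x1 + x2
    if n == 0:
        # No reads in either condition: no evidence against H0.
        return 1.0
    k = min(x1, x2)
    # P(X <= k) for X ~ Binomial(n, 1/2); by symmetry this equals the
    # probability of the opposite tail, so doubling gives a two-sided value.
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * lower_tail)
```

For example, `conditional_exact_pvalue(0, 5)` returns 0.0625: even a 0 versus 5 split is not significant at the 5% level, which shows why low counts make differences hard to detect.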

RESULTS

Through simulation studies and real data analysis, we find that the Wald test applied to log-transformed data is more powerful than the other methods, including the likelihood ratio test, which has power similar to that of the variance-stabilizing transformation test; both are in turn more powerful than the conditional exact test and the Fisher exact test.
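
The three asymptotic tests named above can be sketched as follows. This is a hedged illustration using textbook forms, not necessarily the exact statistics used in the paper. It assumes a single count per condition, equal exposure, a 0.5 offset for zero counts in the log-scale Wald test, and a plain square-root transform for variance stabilization. All function names are hypothetical.

```python
from math import erfc, log, sqrt


def _two_sided_normal_p(z: float) -> float:
    # P(|Z| > |z|) for a standard normal Z.
    return erfc(abs(z) / sqrt(2.0))


def wald_log_test(x1: int, x2: int):
    """Wald test on the log scale; Var(log X) is approximated by 1/x."""
    # Assumption: replace a zero count by 0.5 so the logarithm is defined.
    a = x1 if x1 > 0 else 0.5
    b = x2 if x2 > 0 else 0.5
    z = (log(a) - log(b)) / sqrt(1.0 / a + 1.0 / b)
    return z, _two_sided_normal_p(z)


def likelihood_ratio_test(x1: int, x2: int):
    """Likelihood ratio statistic, referred to chi-square with 1 df."""
    pooled = (x1 + x2) / 2.0  # common mean estimated under H0
    g = 0.0
    for x in (x1, x2):
        if x > 0:  # convention: 0 * log(0) = 0
            g += 2.0 * x * log(x / pooled)
    g = max(g, 0.0)  # guard against tiny negative rounding error
    # For 1 df, P(chi2 > g) = erfc(sqrt(g / 2)).
    return g, erfc(sqrt(g / 2.0))


def vst_test(x1: int, x2: int):
    """Square-root transform; Var(sqrt X) is approximately 1/4."""
    z = sqrt(2.0) * (sqrt(x1) - sqrt(x2))
    return z, _two_sided_normal_p(z)
```

Each function returns the statistic and a two-sided p-value, so the tests can be compared on the same pair of counts, e.g. `wald_log_test(3, 12)` against `likelihood_ratio_test(3, 12)`.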

CONCLUSIONS

When the count data in RNA-seq can reasonably be modelled as Poisson distributed, the Wald-Log test is more powerful and should be used to detect differentially expressed genes.
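
A power comparison of this kind can be approximated with a small Monte Carlo simulation. The sketch below is an illustration under stated assumptions, not the authors' simulation design. It assumes one count per condition, equal exposure, a 0.5 offset for zero counts in the Wald-Log test, and Knuth's method for Poisson sampling (suitable for small means only). The names `poisson_sample`, `wald_log_p`, `conditional_exact_p` and `estimate_power` are hypothetical.

```python
import random
from math import comb, erfc, exp, log, sqrt


def poisson_sample(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate with Knuth's method (small lam only)."""
    threshold = exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def wald_log_p(x1: int, x2: int) -> float:
    """Two-sided p-value of a Wald test on the log scale."""
    a = x1 if x1 > 0 else 0.5  # assumed offset for zero counts
    b = x2 if x2 > 0 else 0.5
    z = (log(a) - log(b)) / sqrt(1.0 / a + 1.0 / b)
    return erfc(abs(z) / sqrt(2.0))


def conditional_exact_p(x1: int, x2: int) -> float:
    """Two-sided conditional exact p-value (Binomial(n, 1/2) under H0)."""
    n = x1 + x2
    if n == 0:
        return 1.0
    lower_tail = sum(comb(n, i) for i in range(min(x1, x2) + 1)) / 2 ** n
    return min(1.0, 2.0 * lower_tail)


def estimate_power(lam1, lam2, p_value, n_sim=2000, alpha=0.05, seed=0):
    """Fraction of simulated count pairs for which the test rejects H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        x1 = poisson_sample(lam1, rng)
        x2 = poisson_sample(lam2, rng)
        if p_value(x1, x2) < alpha:
            rejections += 1
    return rejections / n_sim
```

Calling `estimate_power(2, 6, wald_log_p)` and `estimate_power(2, 6, conditional_exact_p)` gives rejection rates that can be compared at low expression levels. With `lam1 == lam2` the same function estimates the type I error rate, which should be checked before interpreting any difference in power.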

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf4e/3287564/bbf00d9040cc/1752-0509-5-S3-S1-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf4e/3287564/3cc353b3fe22/1752-0509-5-S3-S1-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf4e/3287564/6d1295b9f240/1752-0509-5-S3-S1-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf4e/3287564/8778e13b74ac/1752-0509-5-S3-S1-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf4e/3287564/e109282c72b6/1752-0509-5-S3-S1-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf4e/3287564/e212cf09aa7d/1752-0509-5-S3-S1-6.jpg

Similar articles

1
Statistical methods on detecting differentially expressed genes for RNA-seq data.
BMC Syst Biol. 2011;5 Suppl 3(Suppl 3):S1. doi: 10.1186/1752-0509-5-S3-S1. Epub 2011 Dec 23.
2
A flexible count data model to fit the wide diversity of expression profiles arising from extensively replicated RNA-seq experiments.
BMC Bioinformatics. 2013 Aug 21;14:254. doi: 10.1186/1471-2105-14-254.
3
Differential expression analysis of RNA sequencing data by incorporating non-exonic mapped reads.
BMC Genomics. 2015;16 Suppl 7(Suppl 7):S14. doi: 10.1186/1471-2164-16-S7-S14. Epub 2015 Jun 11.
4
A comparison of statistical methods for detecting differentially expressed genes from RNA-seq data.
Am J Bot. 2012 Feb;99(2):248-56. doi: 10.3732/ajb.1100340. Epub 2012 Jan 20.
5
A Poisson Log-Normal Model for Constructing Gene Covariation Network Using RNA-seq Data.
J Comput Biol. 2017 Jul;24(7):721-731. doi: 10.1089/cmb.2017.0053. Epub 2017 May 30.
6
rSeqDiff: detecting differential isoform expression from RNA-Seq data using hierarchical likelihood ratio test.
PLoS One. 2013 Nov 18;8(11):e79448. doi: 10.1371/journal.pone.0079448. eCollection 2013.
7
LFCseq: a nonparametric approach for differential expression analysis of RNA-seq data.
BMC Genomics. 2014;15 Suppl 10(Suppl 10):S7. doi: 10.1186/1471-2164-15-S10-S7. Epub 2014 Dec 12.
8
PLNseq: a multivariate Poisson lognormal distribution for high-throughput matched RNA-sequencing read count data.
Stat Med. 2015 Apr 30;34(9):1577-89. doi: 10.1002/sim.6449. Epub 2015 Jan 30.
9
BALLI: Bartlett-adjusted likelihood-based linear model approach for identifying differentially expressed genes with RNA-seq data.
BMC Genomics. 2019 Jul 2;20(1):540. doi: 10.1186/s12864-019-5851-6.
10
Choice of library size normalization and statistical methods for differential gene expression analysis in balanced two-group comparisons for RNA-seq studies.
BMC Genomics. 2020 Jan 28;21(1):75. doi: 10.1186/s12864-020-6502-7.

Cited by

1
Proteomizer: Leveraging the Transcriptome-Proteome Mismatch to Infer Novel Gene Regulatory Relations.
bioRxiv. 2025 Jun 27:2025.06.22.660946. doi: 10.1101/2025.06.22.660946.
2
Warmer temperature during asexual reproduction induce methylome, transcriptomic, and lasting phenotypic changes in ecotypes.
Hortic Res. 2023 Jul 31;10(9):uhad156. doi: 10.1093/hr/uhad156. eCollection 2023 Sep.
3
Analysis of the difference between early-bolting and non-bolting roots of Angelica dahurica based on transcriptome sequencing.
Sci Rep. 2023 May 15;13(1):7847. doi: 10.1038/s41598-023-34554-5.
4
Transcriptome analyses reveal the expression profile of genes related to lignan biosynthesis in L. Hoffm. Gen.
Physiol Mol Biol Plants. 2022 Feb;28(2):333-346. doi: 10.1007/s12298-022-01156-w. Epub 2022 Mar 13.
5
An Oligomeric Sulfated Hyaluronan and Silk-Elastinlike Polymer Combination Protects against Murine Radiation Induced Proctitis.
Pharmaceutics. 2022 Jan 12;14(1):175. doi: 10.3390/pharmaceutics14010175.
6
A candidate gene identified in converting platycoside E to platycodin D from Platycodon grandiflorus by transcriptome and main metabolites analysis.
Sci Rep. 2021 May 7;11(1):9810. doi: 10.1038/s41598-021-89294-1.
7
Impact of Sequencing Depth and Library Preparation on Toxicological Interpretation of RNA-Seq Data in a "Three-Sample" Scenario.
Chem Res Toxicol. 2021 Feb 15;34(2):529-540. doi: 10.1021/acs.chemrestox.0c00368. Epub 2020 Dec 23.
8
Cross-omics analysis revealed gut microbiome-related metabolic pathways underlying atherosclerosis development after antibiotics treatment.
Mol Metab. 2020 Jun;36:100976. doi: 10.1016/j.molmet.2020.100976. Epub 2020 Mar 13.
9
Transcriptome analysis of Hua: identification of genes involved in polysaccharide biosynthesis.
Plant Methods. 2019 Jun 26;15:65. doi: 10.1186/s13007-019-0441-9. eCollection 2019.
10
Alterations in the intestinal microbiota of patients with severe and active Graves' orbitopathy: a cross-sectional study.
J Endocrinol Invest. 2019 Aug;42(8):967-978. doi: 10.1007/s40618-019-1010-9. Epub 2019 Jan 23.

References

1
Empirical bayes analysis of sequencing-based transcriptional profiling without replicates.
BMC Bioinformatics. 2010 Nov 16;11:564. doi: 10.1186/1471-2105-11-564.
2
A scaling normalization method for differential expression analysis of RNA-seq data.
Genome Biol. 2010;11(3):R25. doi: 10.1186/gb-2010-11-3-r25. Epub 2010 Mar 2.
3
Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments.
BMC Bioinformatics. 2010 Feb 18;11:94. doi: 10.1186/1471-2105-11-94.
4
Gene ontology analysis for RNA-seq: accounting for selection bias.
Genome Biol. 2010;11(2):R14. doi: 10.1186/gb-2010-11-2-r14. Epub 2010 Feb 4.
5
DEGseq: an R package for identifying differentially expressed genes from RNA-seq data.
Bioinformatics. 2010 Jan 1;26(1):136-8. doi: 10.1093/bioinformatics/btp612. Epub 2009 Oct 24.
6
Measuring differential gene expression by short read sequencing: quantitative comparison to 2-channel gene expression microarrays.
BMC Genomics. 2009 May 12;10:221. doi: 10.1186/1471-2164-10-221.
7
Transcript length bias in RNA-seq data confounds systems biology.
Biol Direct. 2009 Apr 16;4:14. doi: 10.1186/1745-6150-4-14.
8
Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling.
Science. 2009 Apr 10;324(5924):218-23. doi: 10.1126/science.1168978. Epub 2009 Feb 12.
9
RNA-Seq: a revolutionary tool for transcriptomics.
Nat Rev Genet. 2009 Jan;10(1):57-63. doi: 10.1038/nrg2484.
10
RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays.
Genome Res. 2008 Sep;18(9):1509-17. doi: 10.1101/gr.079558.108. Epub 2008 Jun 11.