Joint analysis of differential gene expression in multiple studies using correlation motifs.

Authors

Wei Yingying, Tenzen Toyoaki, Ji Hongkai

Affiliations

Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA; Department of Statistics, The Chinese University of Hong Kong, Shatin NT, Hong Kong.

Center for Regenerative Medicine, Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA 02114, USA.

Publication information

Biostatistics. 2015 Jan;16(1):31-46. doi: 10.1093/biostatistics/kxu038. Epub 2014 Aug 19.

DOI:10.1093/biostatistics/kxu038

PMID:25143368

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4263229/

Abstract

The standard methods for detecting differential gene expression are mostly designed for analyzing a single gene expression experiment. When data from multiple related gene expression studies are available, separately analyzing each study is not ideal as it may fail to detect important genes with consistent but relatively weak differential signals in multiple studies. Jointly modeling all data allows one to borrow information across studies to improve the analysis. However, a simple concordance model, in which each gene is assumed to be differential in either all studies or none of the studies, is incapable of handling genes with study-specific differential expression. In contrast, a model that naively enumerates and analyzes all possible differential patterns across studies can deal with study-specificity and allow information pooling, but the complexity of its parameter space grows exponentially as the number of studies increases. Here, we propose a correlation motif approach to address this dilemma. This approach searches for a small number of latent probability vectors called correlation motifs to capture the major correlation patterns among multiple studies. The motifs provide the basis for sharing information among studies and genes. The approach has flexibility to handle all possible study-specific differential patterns. It improves detection of differential expression and overcomes the barrier of exponential model complexity.
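
To make the complexity argument concrete, the following is a minimal sketch rather than the authors' implementation. It assumes each correlation motif is a vector holding one probability of differential expression per study, and that motifs are combined through mixing weights; the function names and this parameterization are illustrative assumptions based only on the abstract's description of "latent probability vectors".

```python
import random


def full_enumeration_params(num_studies):
    """Free parameters when every on/off pattern across studies gets its own probability."""
    # There are 2^D binary patterns; their probabilities sum to 1.
    return 2 ** num_studies - 1


def motif_params(num_studies, num_motifs):
    """Free parameters under the assumed motif parameterization."""
    # One probability per study per motif, plus (K - 1) free mixing weights.
    return num_motifs * num_studies + (num_motifs - 1)


def sample_pattern(motifs, weights, rng):
    """Draw one gene's differential pattern: pick a motif, then flip one coin per study."""
    motif = rng.choices(motifs, weights=weights, k=1)[0]
    return [1 if rng.random() < q else 0 for q in motif]


if __name__ == "__main__":
    for d in (2, 5, 10, 20):
        print(d, full_enumeration_params(d), motif_params(d, num_motifs=3))

    # Hypothetical motifs for 4 studies: mostly non-differential,
    # differential everywhere, and differential in the first two studies only.
    motifs = [
        [0.05, 0.05, 0.05, 0.05],
        [0.90, 0.90, 0.90, 0.90],
        [0.90, 0.90, 0.05, 0.05],
    ]
    weights = [0.7, 0.2, 0.1]
    rng = random.Random(0)
    print([sample_pattern(motifs, weights, rng) for _ in range(5)])
```

Under these assumptions the motif count grows linearly in the number of studies, while any of the 2^D patterns can still be generated because each study's status is drawn separately given the motif.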
