Mixed-model coexpression: calculating gene coexpression while accounting for expression heterogeneity.

Affiliation

Department of Computer Science, University of California, Los Angeles, CA 90024, USA.

Publication information

Bioinformatics. 2011 Jul 1;27(13):i288-94. doi: 10.1093/bioinformatics/btr221.

DOI:10.1093/bioinformatics/btr221

PMID:21685083

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3117390/

Abstract

MOTIVATION

The analysis of gene coexpression is at the core of many types of genetic analysis. The coexpression between two genes can be calculated using the traditional Pearson's correlation coefficient. However, unobserved confounding effects may inflate the Pearson's correlation so that uncorrelated genes appear correlated. Many general methods have been proposed to remove the effects of confounding from gene expression data. However, residual confounding that these generic correction procedures fail to account for can still induce correlation between genes. Therefore, a method that specifically aims to calculate gene coexpression between gene expression arrays, while accounting for confounding effects, is desirable.
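The inflation described above can be illustrated with a small simulation. The sketch below is not taken from the paper: the function names (`pearson`, `simulate`), the sample size, and the confounder strength are all assumptions chosen for illustration. It draws two genes with independent noise, adds the same unobserved per-sample effect to both, and compares Pearson's correlation with and without that shared effect.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_samples=500, confounder_strength=2.0, seed=0):
    """Two genes with independent noise plus one shared hidden per-sample effect.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    hidden = [rng.gauss(0, 1) for _ in range(n_samples)]   # unobserved confounder
    noise_a = [rng.gauss(0, 1) for _ in range(n_samples)]  # gene A without confounding
    noise_b = [rng.gauss(0, 1) for _ in range(n_samples)]  # gene B without confounding
    gene_a = [e + confounder_strength * h for e, h in zip(noise_a, hidden)]
    gene_b = [e + confounder_strength * h for e, h in zip(noise_b, hidden)]
    return noise_a, noise_b, gene_a, gene_b


noise_a, noise_b, gene_a, gene_b = simulate()
r_unconfounded = pearson(noise_a, noise_b)  # the genes are truly uncorrelated
r_confounded = pearson(gene_a, gene_b)      # same genes, shared hidden effect added
print(f"without confounder: {r_unconfounded:.3f}")
print(f"with confounder:    {r_confounded:.3f}")
```

Under these assumptions the population correlation of the confounded pair is s²/(1 + s²) for confounder strength s, so a shared effect only twice as large as the noise is already enough to make two unrelated genes look strongly coexpressed.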

RESULTS

In this article, we present a statistical model for calculating gene coexpression called mixed model coexpression (MMC), which models coexpression within a mixed model framework. Confounding effects are expected to be encoded in the matrix representing the correlation between arrays, the inter-sample correlation matrix. By conditioning on the information in the inter-sample correlation matrix, MMC is able to produce gene coexpressions that are not influenced by global confounding effects and thus significantly reduce the number of spurious coexpressions observed. We applied MMC to both human and yeast datasets and show that it prioritizes strong coexpressions more effectively than either a traditional Pearson's correlation or a Pearson's correlation applied to data corrected with surrogate variable analysis (SVA).
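One way to picture conditioning on the inter-sample correlation matrix is generalized least squares whitening. The sketch below is a simplified stand-in and not the authors' R implementation: it fixes the two variance components (`sigma_g2`, `sigma_e2`) by assumption instead of estimating them, and every function name and simulation setting is hypothetical. It estimates an inter-sample covariance matrix from all genes, builds a mixed-model-style covariance from it, and computes the correlation of a gene pair after whitening with that covariance. With the identity matrix as covariance, the same function reduces to ordinary Pearson's correlation.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def whiten(low, y):
    """Solve L z = y by forward substitution."""
    z = []
    for i, row in enumerate(low):
        z.append((y[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return z


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conditional_correlation(x, y, cov):
    """Correlation of x and y after whitening by cov and removing the intercept."""
    low = cholesky(cov)
    zx, zy = whiten(low, x), whiten(low, y)
    z1 = whiten(low, [1.0] * len(x))  # whitened intercept column
    bx, by = dot(zx, z1) / dot(z1, z1), dot(zy, z1) / dot(z1, z1)
    rx = [a - bx * c for a, c in zip(zx, z1)]
    ry = [b - by * c for b, c in zip(zy, z1)]
    return dot(rx, ry) / math.sqrt(dot(rx, rx) * dot(ry, ry))


def inter_sample_covariance(expr):
    """Sample-by-sample covariance averaged over genes (rows of expr)."""
    n = len(expr[0])
    centered = []
    for gene in expr:
        mean = sum(gene) / n
        centered.append([v - mean for v in gene])
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            k[i][j] = k[j][i] = sum(g[i] * g[j] for g in centered) / len(expr)
    return k


def mixed_model_covariance(k, sigma_g2=1.0, sigma_e2=0.5):
    """sigma_g2 * K + sigma_e2 * I, with both components fixed by assumption."""
    n = len(k)
    return [[sigma_g2 * k[i][j] + (sigma_e2 if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def simulate(n_samples=80, n_genes=200, seed=1):
    """Genes share one hidden per-sample effect; genes 0 and 1 are otherwise unrelated."""
    rng = random.Random(seed)
    hidden = [rng.gauss(0, 1) for _ in range(n_samples)]
    loadings = [2.0, 2.0] + [rng.gauss(0, 2) for _ in range(n_genes - 2)]
    return [[w * h + rng.gauss(0, 1) for h in hidden] for w in loadings]


expr = simulate()
n = len(expr[0])
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
cov = mixed_model_covariance(inter_sample_covariance(expr))

r_pearson = conditional_correlation(expr[0], expr[1], identity)
r_conditioned = conditional_correlation(expr[0], expr[1], cov)
print(f"Pearson:     {r_pearson:.3f}")
print(f"conditioned: {r_conditioned:.3f}")
```

Because the hidden effect touches many genes, it dominates the estimated inter-sample covariance, and whitening shrinks exactly that direction before the pairwise correlation is computed. A full mixed-model treatment would estimate the variance components from the data, which this sketch deliberately skips.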

AVAILABILITY

The method is implemented in the R programming language and may be found at http://genetics.cs.ucla.edu/mmc.

CONTACT

nfurlott@cs.ucla.edu; eeskin@cs.ucla.edu.
