Clustering expressed genes on the basis of their association with a quantitative phenotype.

Author information

Jia Zhenyu, Xu Shizhong

Affiliation

Department of Botany and Plant Sciences, University of California, Riverside, 92521, USA.

Publication information

Genet Res. 2005 Dec;86(3):193-207. doi: 10.1017/S0016672305007822.

DOI:10.1017/S0016672305007822

PMID:16454859

Abstract

Cluster analyses of gene expression data are usually conducted based on their associations with the phenotype of a particular disease. Many disease traits have a clearly defined binary phenotype (presence or absence), so that genes can be clustered based on the differences of expression levels between the two contrasting phenotypic groups. For example, cluster analysis based on binary phenotype has been successfully used in tumour research. Some complex diseases have phenotypes that vary in a continuous manner and the method developed for a binary trait is not immediately applicable to a continuous trait. However, understanding the role of gene expression in these complex traits is of fundamental importance. Therefore, it is necessary to develop a new statistical method to cluster expressed genes based on their association with a quantitative trait phenotype. We developed a model-based clustering method to classify genes based on their association with a continuous phenotype. We used a linear model to describe the relationship between gene expression and the phenotypic value. The model effects of the linear model (linear regression coefficients) represent the strength of the association. We assumed that the model effects of each gene follow a mixture of several multivariate Gaussian distributions. Parameter estimation and cluster assignment were accomplished via an Expectation-Maximization (EM) algorithm. The method was verified by analysing two simulated datasets, and further demonstrated using real data generated in a microarray experiment for the study of gene expression associated with Alzheimer's disease.
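The pipeline described in the abstract (per-gene regression on the phenotype, then a Gaussian mixture over the regression coefficients fitted by EM) can be illustrated with a minimal two-stage sketch. This is not the authors' implementation: the function names (`gene_effects`, `log_gaussian`, `fit_mixture`), the use of linear and quadratic terms in a standardized phenotype, the chunk-based starting values, and the small ridge added to each covariance are all assumptions made for illustration, and the data are simulated.

```python
import numpy as np


def gene_effects(expr, z, degree=2):
    """Per-gene least-squares coefficients of expression on the phenotype.

    expr: (n_genes, n_samples) expression matrix; z: (n_samples,) phenotype.
    Returns (n_genes, degree) coefficients with the intercept dropped.
    """
    zs = (z - z.mean()) / z.std()                              # standardized phenotype
    X = np.column_stack([zs ** p for p in range(degree + 1)])  # columns 1, z, z^2, ...
    coef, *_ = np.linalg.lstsq(X, expr.T, rcond=None)          # (degree + 1, n_genes)
    return coef[1:].T


def log_gaussian(b, mean, cov):
    """Log density of a multivariate normal evaluated at each row of b."""
    d = b.shape[1]
    diff = b - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)


def fit_mixture(b, k, n_iter=100, ridge=1e-6):
    """EM for a k-component Gaussian mixture over the rows of b."""
    n, d = b.shape
    # Heuristic start (an assumption): split genes into k chunks ordered by
    # their first coefficient and use each chunk's mean and covariance.
    chunks = np.array_split(np.argsort(b[:, 0]), k)
    means = np.array([b[idx].mean(axis=0) for idx in chunks])
    covs = np.array([np.atleast_2d(np.cov(b[idx].T)) + ridge * np.eye(d)
                     for idx in chunks])
    weights = np.array([len(idx) / n for idx in chunks])
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each gene.
        log_r = np.column_stack([
            np.log(weights[j]) + log_gaussian(b, means[j], covs[j])
            for j in range(k)
        ])
        log_r -= log_r.max(axis=1, keepdims=True)  # guard against underflow
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing proportions, means, and covariances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ b) / nk[:, None]
        for j in range(k):
            diff = b - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + ridge * np.eye(d)
    return weights, means, covs, resp.argmax(axis=1)


# Simulated example: 150 genes, 40 samples, three association patterns.
rng = np.random.default_rng(0)
z = rng.normal(size=40)
slopes = np.repeat([-1.5, 0.0, 1.5], 50)
expr = slopes[:, None] * z + rng.normal(scale=0.5, size=(150, 40))
weights, means, covs, labels = fit_mixture(gene_effects(expr, z), k=3)
print(np.bincount(labels, minlength=3))  # number of genes assigned to each cluster
```

Fitting the regression and the mixture in two separate steps keeps the sketch short; a joint model, as the abstract suggests, would instead estimate the gene effects and the cluster parameters together within the EM iterations.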

Similar articles

Quantitative trait associated microarray gene expression data analysis.

Mol Biol Evol. 2006 Aug;23(8):1558-73. doi: 10.1093/molbev/msl019. Epub 2006 May 26.

A mixture model with random-effects components for clustering correlated gene-expression profiles.

Bioinformatics. 2006 Jul 15;22(14):1745-52. doi: 10.1093/bioinformatics/btl165. Epub 2006 May 3.

Clustering microarray gene expression data using weighted Chinese restaurant process.

Bioinformatics. 2006 Aug 15;22(16):1988-97. doi: 10.1093/bioinformatics/btl284. Epub 2006 Jun 9.

Clustering of change patterns using Fourier coefficients.

Bioinformatics. 2008 Jan 15;24(2):184-91. doi: 10.1093/bioinformatics/btm568. Epub 2007 Nov 19.

A GMM-IG framework for selecting genes as expression panel biomarkers.

Artif Intell Med. 2010 Feb-Mar;48(2-3):75-82. doi: 10.1016/j.artmed.2009.07.006. Epub 2009 Dec 8.

Incorporating gene functions as priors in model-based clustering of microarray gene expression data.

Bioinformatics. 2006 Apr 1;22(7):795-801. doi: 10.1093/bioinformatics/btl011. Epub 2006 Jan 24.

Gene-environment interactions in complex diseases: genetic models and methods for QTL mapping in multiple half-sib populations.

Genet Res. 2006 Oct;88(2):119-31. doi: 10.1017/S0016672306008391. Epub 2006 Sep 15.

Mapping binary trait loci in the F(2:3) design.

J Hered. 2007 Jul-Aug;98(4):337-44. doi: 10.1093/jhered/esm041. Epub 2007 Jul 10.

Mapping quantitative trait loci for traits defined as ratios.

Genetica. 2008 Mar;132(3):323-9. doi: 10.1007/s10709-007-9175-0. Epub 2007 Aug 2.

Cited by

A novel temperature-humidity index-based model for evaluating semen characteristics in Thai native roosters under tropical conditions.

Poult Sci. 2025 May 20;104(8):105321. doi: 10.1016/j.psj.2025.105321.

A genetical genomics approach to genome scans increases power for QTL mapping.

Genetics. 2011 Mar;187(3):939-53. doi: 10.1534/genetics.110.123968. Epub 2010 Dec 31.

Clustering of gene expression data and end-point measurements by simulated annealing.

J Bioinform Comput Biol. 2009 Feb;7(1):193-215. doi: 10.1142/s021972000900400x.

Bayesian mixture model analysis for detecting differentially expressed genes.

Int J Plant Genomics. 2008;2008:892927. doi: 10.1155/2008/892927.

A hierarchical approach employing metabolic and gene expression profiles to identify the pathways that confer cytotoxicity in HepG2 cells.

BMC Syst Biol. 2007 May 11;1:21. doi: 10.1186/1752-0509-1-21.

Probe-level linear model fitting and mixture modeling results in high accuracy detection of differential gene expression.

BMC Bioinformatics. 2006 Aug 25;7:391. doi: 10.1186/1471-2105-7-391.
