Dissecting high-dimensional phenotypes with Bayesian sparse factor analysis of genetic covariance matrices.

Affiliation

Department of Biology, Duke University, Durham, North Carolina 27708, USA.

Publication information

Genetics. 2013 Jul;194(3):753-67. doi: 10.1534/genetics.113.151217. Epub 2013 May 1.

DOI:10.1534/genetics.113.151217

PMID:23636737

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3697978/

Abstract

Quantitative genetic studies that model complex, multivariate phenotypes are important for both evolutionary prediction and artificial selection. For example, changes in gene expression can provide insight into developmental and physiological mechanisms that link genotype and phenotype. However, classical analytical techniques are poorly suited to quantitative genetic studies of gene expression, where the number of traits assayed per individual can reach many thousands. Here, we derive a Bayesian genetic sparse factor model for estimating the genetic covariance matrix (G-matrix) of high-dimensional traits, such as gene expression, in a mixed-effects model. The key idea of our model is that we need to consider only G-matrices that are biologically plausible. An organism's entire phenotype is the result of processes that are modular and have limited complexity. This implies that the G-matrix will be highly structured. In particular, we assume that a limited number of intermediate traits (or factors, e.g., variations in development or physiology) control the variation in the high-dimensional phenotype, and that each of these intermediate traits is sparse, affecting only a few observed traits. The advantages of this approach are twofold. First, sparse factors are interpretable and provide biological insight into mechanisms underlying the genetic architecture. Second, enforcing sparsity helps prevent sampling errors from swamping out the true signal in high-dimensional data. We demonstrate the advantages of our model on simulated data and in an analysis of a published Drosophila melanogaster gene expression data set.
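To make the structural assumption concrete, the following is a minimal simulation sketch, not the authors' implementation and without any Bayesian inference. It assumes the common factor-analytic form G = ΛΛᵀ + Ψ, where Λ is a sparse loading matrix (each factor touches only a few traits) and Ψ is a diagonal matrix of trait-specific genetic variances. The function name `simulate_sparse_factor_G`, its parameters, and the chosen distributions are all hypothetical; the paper's exact parameterization and priors are not given in the abstract.

```python
import numpy as np


def simulate_sparse_factor_G(p=50, k=3, traits_per_factor=5, seed=0):
    """Build a structured G-matrix from k sparse factors over p traits.

    Assumption (not from the abstract): G = Lambda @ Lambda.T + diag(psi).
    """
    rng = np.random.default_rng(seed)

    # Sparse loadings: each factor (column) affects only a few traits (rows).
    Lambda = np.zeros((p, k))
    for j in range(k):
        idx = rng.choice(p, size=traits_per_factor, replace=False)
        Lambda[idx, j] = rng.normal(0.0, 1.0, size=traits_per_factor)

    # Trait-specific genetic variance not explained by the factors.
    psi = rng.uniform(0.1, 0.5, size=p)

    G = Lambda @ Lambda.T + np.diag(psi)
    return Lambda, psi, G


Lambda, psi, G = simulate_sparse_factor_G()

# Genetic covariances are nonzero only between traits that share a factor.
off_diag = G[~np.eye(G.shape[0], dtype=bool)]
print("nonzero genetic covariances:", np.count_nonzero(off_diag), "of", off_diag.size)

# The factor part has rank at most k, far below the number of traits.
print("rank of factor part:", np.linalg.matrix_rank(Lambda @ Lambda.T))
```

Under these assumptions the p × p matrix is described by at most k × traits_per_factor loadings plus p variances, rather than p(p + 1)/2 free parameters, which is the sense in which sparsity limits how much sampling error can enter the estimate.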

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f6e7/3697978/0bbc5d84d948/753fig1.jpg

Similar articles

Structure and stability of genetic variance-covariance matrices: A Bayesian sparse factor analysis of transcriptional variation in the three-spined stickleback.

Mol Ecol. 2017 Oct;26(19):5099-5113. doi: 10.1111/mec.14265. Epub 2017 Aug 21.

Maintenance of quantitative genetic variance in complex, multitrait phenotypes: the contribution of rare, large effect variants in 2 Drosophila species.

Genetics. 2022 Sep 30;222(2). doi: 10.1093/genetics/iyac122.

Survey of the Heritability and Sparse Architecture of Gene Expression Traits across Human Tissues.

PLoS Genet. 2016 Nov 11;12(11):e1006423. doi: 10.1371/journal.pgen.1006423. eCollection 2016 Nov.

Exploring a Bayesian sparse factor model-based strategy for the genetic analysis of thousands of mid-infrared spectra traits for animal breeding.

J Dairy Sci. 2024 Nov;107(11):9615-9627. doi: 10.3168/jds.2023-24319. Epub 2024 Jul 4.

Comparing G: multivariate analysis of genetic variation in multiple populations.

Heredity (Edinb). 2014 Jan;112(1):21-9. doi: 10.1038/hdy.2013.12. Epub 2013 Mar 13.

Estimating sampling error of evolutionary statistics based on genetic covariance matrices using maximum likelihood.

J Evol Biol. 2015 Aug;28(8):1542-9. doi: 10.1111/jeb.12674. Epub 2015 Jul 21.

Erratum: High-Throughput Identification of Resistance to Pseudomonas syringae pv. Tomato in Tomato using Seedling Flood Assay.

J Vis Exp. 2023 Oct 18(200). doi: 10.3791/6576.

A Multiple-Trait Bayesian Lasso for Genome-Enabled Analysis and Prediction of Complex Traits.

Genetics. 2020 Feb;214(2):305-331. doi: 10.1534/genetics.119.302934. Epub 2019 Dec 26.

Factor analysis for gene regulatory networks and transcription factor activity profiles.

BMC Bioinformatics. 2007 Feb 23;8:61. doi: 10.1186/1471-2105-8-61.

Cited by

Measuring natural selection on the transcriptome.

New Phytol. 2025 Sep;247(5):1994-2002. doi: 10.1111/nph.70287. Epub 2025 Jun 5.

Multi-response phylogenetic mixed models: concepts and application.

Biol Rev Camb Philos Soc. 2025 Jun;100(3):1294-1316. doi: 10.1111/brv.70001. Epub 2025 Apr 7.

MegaLMM improves genomic predictions in new environments using environmental covariates.

Genetics. 2025 Jan 8;229(1):1-41. doi: 10.1093/genetics/iyae171.

Principal component analysis revisited: fast multitrait genetic evaluations with smooth convergence.

G3 (Bethesda). 2024 Oct 21;14(12). doi: 10.1093/g3journal/jkae228.

High-dimensional multi-omics measured in controlled conditions are useful for maize platform and field trait predictions.

Theor Appl Genet. 2024 Jul 3;137(7):175. doi: 10.1007/s00122-024-04679-w.

Mega-scale Bayesian regression methods for genome-wide prediction and association studies with thousands of traits.

Genetics. 2023 Mar 2;223(3). doi: 10.1093/genetics/iyac183.

Evolvability and constraint in the evolution of three-dimensional flower morphology.

Am J Bot. 2022 Nov;109(11):1906-1917. doi: 10.1002/ajb2.16092. Epub 2022 Nov 13.

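Maintenance of quantitative genetic variance in complex, multitrait phenotypes: the contribution of rare, large effect variants in 2 Drosophila species.

Genetics. 2022 Sep 30;222(2). doi: 10.1093/genetics/iyac122.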

MegaLMM: Mega-scale linear mixed models for genomic predictions with thousands of traits.

Genome Biol. 2021 Jul 23;22(1):213. doi: 10.1186/s13059-021-02416-w.

Improving Genomic Prediction for Seed Quality Traits in Oat (Avena sativa L.) Using Trait-Specific Relationship Matrices.

Front Genet. 2021 Mar 31;12:643733. doi: 10.3389/fgene.2021.643733. eCollection 2021.
