Genetic and nongenetic variation revealed for the principal components of human gene expression.

Affiliation

University of Queensland Diamantina Institute, The Translational Research Institute, Brisbane, Queensland 4102, Australia.

Publication information

Genetics. 2013 Nov;195(3):1117-28. doi: 10.1534/genetics.113.153221. Epub 2013 Sep 11.

Abstract

Principal components analysis has been employed in gene expression studies to correct for population substructure and batch and environmental effects. This method typically involves the removal of variation contained in as many as 50 principal components (PCs), which can constitute a large proportion of total variation present in the data. Each PC, however, can detect many sources of variation, including gene expression networks and genetic variation influencing transcript levels. We demonstrate that PCs generated from gene expression data can simultaneously contain both genetic and nongenetic factors. From heritability estimates we show that all PCs contain a considerable portion of genetic variation while nongenetic artifacts such as batch effects were associated to varying degrees with the first 60 PCs. These PCs demonstrate an enrichment of biological pathways, including core immune function and metabolic pathways. The use of PC correction in two independent data sets resulted in a reduction in the number of cis- and trans-expression QTL detected. Comparisons of PC and linear model correction revealed that PC correction was not as efficient at removing known batch effects and had a higher penalty on genetic variation. Therefore, this study highlights the danger of eliminating biologically relevant data when employing PC correction in gene expression data.
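The contrast between PC correction and linear model correction described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy expression matrix, the batch labels, and the genotype vector are all hypothetical, and a plain power iteration stands in for a proper SVD routine so that the example needs only the standard library. The toy data are built so that the leading PC is dominated by a batch shift but also carries part of a genotype effect on gene 0.

```python
import math
import random


def center_genes(X):
    """Subtract each gene's (column's) mean across samples (rows)."""
    means = [sum(col) / len(X) for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


def leading_loading(X, iters=200, seed=0):
    """Approximate the leading PC loading (unit vector over genes)
    by power iteration on X^T X. X must already be centered."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in X[0]]
    for _ in range(iters):
        scores = [sum(x * w for x, w in zip(row, v)) for row in X]  # X v
        v = [sum(s * row[j] for s, row in zip(scores, X))  # X^T (X v)
             for j in range(len(v))]
        norm = math.sqrt(sum(w * w for w in v))
        if norm == 0.0:  # no variation left to extract
            break
        v = [w / norm for w in v]
    return v


def remove_top_pcs(X, k):
    """PC correction: center the data, then project out the top k PCs
    one at a time (deflation)."""
    R = center_genes(X)
    for _ in range(k):
        v = leading_loading(R)
        scores = [sum(x * w for x, w in zip(row, v)) for row in R]
        R = [[x - s * w for x, w in zip(row, v)] for s, row in zip(scores, R)]
    return R


def remove_batch_means(X, batch):
    """Linear model correction for one known categorical covariate:
    subtract each batch's per-gene mean."""
    out = [None] * len(X)
    for b in set(batch):
        idx = [i for i, bi in enumerate(batch) if bi == b]
        means = [sum(X[i][j] for i in idx) / len(idx) for j in range(len(X[0]))]
        for i in idx:
            out[i] = [x - m for x, m in zip(X[i], means)]
    return out


def batch_gap(X, batch, gene):
    """Mean expression of one gene in batch 1 minus batch 0."""
    lo = [row[gene] for row, b in zip(X, batch) if b == 0]
    hi = [row[gene] for row, b in zip(X, batch) if b == 1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)


def slope(X, genotype, gene):
    """Least-squares slope of one gene's expression on genotype."""
    g_mean = sum(genotype) / len(genotype)
    y = [row[gene] for row in X]
    y_mean = sum(y) / len(y)
    num = sum((g - g_mean) * (yi - y_mean) for g, yi in zip(genotype, y))
    return num / sum((g - g_mean) ** 2 for g in genotype)


# Hypothetical toy data: 6 samples x 4 genes, a batch shift on every gene,
# and a genotype effect (0.5 per allele) on gene 0 only.
batch = [0, 0, 0, 1, 1, 1]
genotype = [0, 1, 2, 0, 1, 2]
base = [5.0, 6.0, 7.0, 8.0]
expr = [[base[j] + 2.0 * b + (0.5 * g if j == 0 else 0.0) for j in range(4)]
        for b, g in zip(batch, genotype)]

centered = center_genes(expr)
pc_resid = remove_top_pcs(expr, k=1)
lm_resid = remove_batch_means(expr, batch)

for name, data in [("uncorrected", centered),
                   ("PC-corrected", pc_resid),
                   ("batch-mean-corrected", lm_resid)]:
    print(name, "batch gap (gene 1):", round(batch_gap(data, batch, 1), 3),
          "genotype slope (gene 0):", round(slope(data, genotype, 0), 3))
```

In this constructed example, removing one PC leaves a small residual batch gap and shrinks the genotype slope, while subtracting batch means removes the gap and leaves the slope unchanged. The design is balanced and noise-free on purpose, so the sketch shows only the mechanism and says nothing about effect sizes in real data.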
