Genetic and nongenetic variation revealed for the principal components of human gene expression.

Affiliation

University of Queensland Diamantina Institute, The Translational Research Institute, Brisbane, Queensland 4102, Australia.

Publication information

Genetics. 2013 Nov;195(3):1117-28. doi: 10.1534/genetics.113.153221. Epub 2013 Sep 11.

Abstract

Principal components analysis has been employed in gene expression studies to correct for population substructure and batch and environmental effects. This method typically involves the removal of variation contained in as many as 50 principal components (PCs), which can constitute a large proportion of total variation present in the data. Each PC, however, can detect many sources of variation, including gene expression networks and genetic variation influencing transcript levels. We demonstrate that PCs generated from gene expression data can simultaneously contain both genetic and nongenetic factors. From heritability estimates we show that all PCs contain a considerable portion of genetic variation while nongenetic artifacts such as batch effects were associated to varying degrees with the first 60 PCs. These PCs demonstrate an enrichment of biological pathways, including core immune function and metabolic pathways. The use of PC correction in two independent data sets resulted in a reduction in the number of cis- and trans-expression QTL detected. Comparisons of PC and linear model correction revealed that PC correction was not as efficient at removing known batch effects and had a higher penalty on genetic variation. Therefore, this study highlights the danger of eliminating biologically relevant data when employing PC correction in gene expression data.
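The contrast the abstract draws between PC correction and linear model correction can be illustrated with a minimal sketch on simulated data. This is not the authors' pipeline: the function names (`remove_top_pcs`, `regress_out`, `mean_r2`), the sample sizes, the effect sizes, and the choice of two PCs are all assumptions made for illustration. The toy data contain one known batch covariate and one genotype that influences many transcripts, so the top PCs can absorb both.

```python
import numpy as np

def remove_top_pcs(expr, k):
    """Remove the variation captured by the first k PCs (samples x genes matrix)."""
    centered = expr - expr.mean(axis=0)
    # SVD of the centered data: u * s are PC scores, rows of vt are loadings.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    top = (u[:, :k] * s[:k]) @ vt[:k, :]
    return centered - top

def regress_out(expr, covariate):
    """Remove one known covariate (e.g. batch) from every gene by least squares."""
    centered = expr - expr.mean(axis=0)
    design = np.column_stack([np.ones(len(covariate)), covariate])
    beta, *_ = np.linalg.lstsq(design, centered, rcond=None)
    return centered - design @ beta

def mean_r2(x, mat):
    """Mean squared correlation between vector x and each column of mat."""
    xc = x - x.mean()
    mc = mat - mat.mean(axis=0)
    return float(np.mean((xc @ mc) ** 2 / ((xc @ xc) * (mc ** 2).sum(axis=0))))

# Simulated toy data (all sizes and effect scales are illustrative assumptions).
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 50
batch = rng.integers(0, 2, size=n_samples).astype(float)     # known batch label
genotype = rng.integers(0, 3, size=n_samples).astype(float)  # 0/1/2 allele count
expr = rng.normal(size=(n_samples, n_genes))
expr += np.outer(batch, rng.normal(size=n_genes))                 # batch effect
expr += np.outer(genotype, rng.normal(scale=0.8, size=n_genes))   # genetic effect

pc_corrected = remove_top_pcs(expr, k=2)
lm_corrected = regress_out(expr, batch)

for name, mat in [("raw", expr), ("PC-corrected", pc_corrected),
                  ("LM-corrected", lm_corrected)]:
    print(f"{name:>13}: batch r2 = {mean_r2(batch, mat):.4f}, "
          f"genotype r2 = {mean_r2(genotype, mat):.4f}")
```

Regressing out the known covariate removes exactly that covariate and leaves the rest of the matrix alone, whereas removing top PCs strips whatever dominates the variance, whether technical or genetic. In real data the appropriate number of PCs and the set of covariates would need to be chosen for the study at hand.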

Similar articles

Exploring pleiotropy using principal components.
BMC Genet. 2003 Dec 31;4 Suppl 1(Suppl 1):S53. doi: 10.1186/1471-2156-4-S1-S53.

Cited by

Seasonal effects on gene expression.
PLoS One. 2015 May 29;10(5):e0126995. doi: 10.1371/journal.pone.0126995. eCollection 2015.
The role of regulatory variation in complex traits and disease.
Nat Rev Genet. 2015 Apr;16(4):197-212. doi: 10.1038/nrg3891. Epub 2015 Feb 24.
Regularized machine learning in the genetic prediction of complex traits.
PLoS Genet. 2014 Nov 13;10(11):e1004754. doi: 10.1371/journal.pgen.1004754. eCollection 2014 Nov.

References

Patterns of cis regulatory variation in diverse human populations.
PLoS Genet. 2012;8(4):e1002639. doi: 10.1371/journal.pgen.1002639. Epub 2012 Apr 19.
