Genetic and nongenetic variation revealed for the principal components of human gene expression.

Affiliation

University of Queensland Diamantina Institute, The Translational Research Institute, Brisbane, Queensland 4102, Australia.

Publication information

Genetics. 2013 Nov;195(3):1117-28. doi: 10.1534/genetics.113.153221. Epub 2013 Sep 11.

DOI:10.1534/genetics.113.153221

PMID:24026092

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3813841/

Abstract

Principal components analysis has been employed in gene expression studies to correct for population substructure and batch and environmental effects. This method typically involves the removal of variation contained in as many as 50 principal components (PCs), which can constitute a large proportion of total variation present in the data. Each PC, however, can detect many sources of variation, including gene expression networks and genetic variation influencing transcript levels. We demonstrate that PCs generated from gene expression data can simultaneously contain both genetic and nongenetic factors. From heritability estimates we show that all PCs contain a considerable portion of genetic variation while nongenetic artifacts such as batch effects were associated to varying degrees with the first 60 PCs. These PCs demonstrate an enrichment of biological pathways, including core immune function and metabolic pathways. The use of PC correction in two independent data sets resulted in a reduction in the number of cis- and trans-expression QTL detected. Comparisons of PC and linear model correction revealed that PC correction was not as efficient at removing known batch effects and had a higher penalty on genetic variation. Therefore, this study highlights the danger of eliminating biologically relevant data when employing PC correction in gene expression data.
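The contrast the abstract draws between PC correction and linear model correction can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are simulated, and the function names (`remove_top_pcs`, `regress_out`, `variance_explained`), the sample sizes, and the effect sizes are all assumptions chosen for illustration. PC correction subtracts the variation captured by the first k principal components of the expression matrix, whereas linear model correction regresses out only a known covariate such as batch.

```python
import numpy as np


def remove_top_pcs(expr, k):
    """Subtract the variation captured by the first k PCs (samples x genes)."""
    centered = expr - expr.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return centered - (u[:, :k] * s[:k]) @ vt[:k]


def regress_out(expr, covariates):
    """Return least-squares residuals after fitting an intercept plus known covariates."""
    design = np.column_stack([np.ones(len(covariates)), covariates])
    beta, *_ = np.linalg.lstsq(design, expr, rcond=None)
    return expr - design @ beta


def variance_explained(expr):
    """Proportion of total variance carried by each PC."""
    centered = expr - expr.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return s**2 / np.sum(s**2)


def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]


# Simulated data: an assumption for illustration, not the study's data.
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 50
batch = rng.integers(0, 2, size=n_samples).astype(float)     # known batch label
genotype = rng.integers(0, 3, size=n_samples).astype(float)  # SNP coded 0/1/2
expr = rng.normal(size=(n_samples, n_genes))
expr += np.outer(batch, rng.normal(size=n_genes))  # batch shifts every gene
expr[:, 0] += 0.8 * genotype                       # genetic effect on gene 0

pc_corrected = remove_top_pcs(expr, k=5)
lm_corrected = regress_out(expr, batch)

for label, data in [("raw", expr), ("PC-corrected", pc_corrected),
                    ("LM-corrected", lm_corrected)]:
    print(label,
          "| gene 0 vs genotype:", round(corr(genotype, data[:, 0]), 3),
          "| gene 1 vs batch:", round(corr(batch, data[:, 1]), 3))
```

Comparing the printed correlations shows how much of the simulated genotype signal and batch signal each correction leaves behind. In real data the outcome depends on how batch, genetic, and network variation load onto the leading PCs, which is the question the paper examines.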

Similar articles

How powerful are summary-based methods for identifying expression-trait associations under different genetic architectures?

Pac Symp Biocomput. 2018;23:228-239.

Identifying the genetic variation of gene expression using gene sets: application of novel gene Set eQTL approach to PharmGKB and KEGG.

PLoS One. 2012;7(8):e43301. doi: 10.1371/journal.pone.0043301. Epub 2012 Aug 14.

On the substructure controls in rare variant analysis: Principal components or variance components?

Genet Epidemiol. 2018 Apr;42(3):276-287. doi: 10.1002/gepi.22102. Epub 2017 Dec 26.

Accuracy of heritability estimations in presence of hidden population stratification.

Sci Rep. 2016 May 25;6:26471. doi: 10.1038/srep26471.

Exploring pleiotropy using principal components.

BMC Genet. 2003 Dec 31;4 Suppl 1(Suppl 1):S53. doi: 10.1186/1471-2156-4-S1-S53.

Principal component regression and linear mixed model in association analysis of structured samples: competitors or complements?

Genet Epidemiol. 2015 Mar;39(3):149-55. doi: 10.1002/gepi.21879. Epub 2014 Dec 23.

Comparison of three methods for obtaining principal components from family data in genetic analysis of complex disease.

Genet Epidemiol. 2001;21 Suppl 1:S726-31. doi: 10.1002/gepi.2001.21.s1.s726.

Mapping genomic regions affecting milk traits in Sarda sheep by using the OvineSNP50 Beadchip and principal components to perform combined linkage and linkage disequilibrium analysis.

Genet Sel Evol. 2019 Nov 19;51(1):65. doi: 10.1186/s12711-019-0508-0.

A ridge penalized principal-components approach based on heritability for high-dimensional data.

Hum Hered. 2007;64(3):182-91. doi: 10.1159/000102991. Epub 2007 May 25.

Cited by

Trans Effects on Gene Expression Can Drive Omnigenic Inheritance.

Cell. 2019 May 2;177(4):1022-1034.e6. doi: 10.1016/j.cell.2019.04.014.

An independent component analysis confounding factor correction framework for identifying broad impact expression quantitative trait loci.

PLoS Comput Biol. 2017 May 15;13(5):e1005537. doi: 10.1371/journal.pcbi.1005537. eCollection 2017 May.

Autosomal genetic control of human gene expression does not differ across the sexes.

Genome Biol. 2016 Dec 1;17(1):248. doi: 10.1186/s13059-016-1111-0.

Population structure of Han Chinese in the modern Taiwanese population based on 10,000 participants in the Taiwan Biobank project.

Hum Mol Genet. 2016 Dec 15;25(24):5321-5331. doi: 10.1093/hmg/ddw346.

Modelling local gene networks increases power to detect trans-acting genetic effects on gene expression.

Genome Biol. 2016 Feb 24;17:33. doi: 10.1186/s13059-016-0895-2.

Seasonal effects on gene expression.

PLoS One. 2015 May 29;10(5):e0126995. doi: 10.1371/journal.pone.0126995. eCollection 2015.

Evaluating intra- and inter-individual variation in the human placental transcriptome.

Genome Biol. 2015 Mar 19;16(1):54. doi: 10.1186/s13059-015-0627-z.

The role of regulatory variation in complex traits and disease.

Nat Rev Genet. 2015 Apr;16(4):197-212. doi: 10.1038/nrg3891. Epub 2015 Feb 24.

LINE-1 methylation in granulocyte DNA and trihalomethane exposure is associated with bladder cancer risk.

Epigenetics. 2014 Nov;9(11):1532-9. doi: 10.4161/15592294.2014.983377.

Regularized machine learning in the genetic prediction of complex traits.

PLoS Genet. 2014 Nov 13;10(11):e1004754. doi: 10.1371/journal.pgen.1004754. eCollection 2014 Nov.

References

Integrative modeling of eQTLs and cis-regulatory elements suggests mechanisms underlying cell type specificity of eQTLs.

PLoS Genet. 2013;9(8):e1003649. doi: 10.1371/journal.pgen.1003649. Epub 2013 Aug 1.

Effect of normalization on statistical and biological interpretation of gene expression profiles.

Front Genet. 2013 May 31;3:160. doi: 10.3389/fgene.2012.00160. eCollection 2012.

Using blood informative transcripts in geographical genomics: impact of lifestyle on gene expression in Fijians.

Front Genet. 2012 Nov 9;3:243. doi: 10.3389/fgene.2012.00243. eCollection 2012.

Genetic association analysis of complex diseases incorporating intermediate phenotype information.

PLoS One. 2012;7(10):e46612. doi: 10.1371/journal.pone.0046612. Epub 2012 Oct 19.

Mapping cis- and trans-regulatory effects across multiple tissues in twins.

Nat Genet. 2012 Oct;44(10):1084-9. doi: 10.1038/ng.2394. Epub 2012 Sep 2.

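Identifying the genetic variation of gene expression using gene sets: application of novel gene Set eQTL approach to PharmGKB and KEGG.

PLoS One. 2012;7(8):e43301. doi: 10.1371/journal.pone.0043301. Epub 2012 Aug 14.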

A mutation in APP protects against Alzheimer's disease and age-related cognitive decline.

Nature. 2012 Aug 2;488(7409):96-9. doi: 10.1038/nature11283.

Extent, causes, and consequences of small RNA expression variation in human adipose tissue.

PLoS Genet. 2012;8(5):e1002704. doi: 10.1371/journal.pgen.1002704. Epub 2012 May 10.

The Brisbane Systems Genetics Study: genetical genomics meets complex trait genetics.

PLoS One. 2012;7(4):e35430. doi: 10.1371/journal.pone.0035430. Epub 2012 Apr 26.

Patterns of cis regulatory variation in diverse human populations.

PLoS Genet. 2012;8(4):e1002639. doi: 10.1371/journal.pgen.1002639. Epub 2012 Apr 19.
