EigenGWAS: finding loci under selection through genome-wide association studies of eigenvectors in structured populations.

Authors

Chen G-B, Lee S H, Zhu Z-X, Benyamin B, Robinson M R

Affiliations

Queensland Brain Institute, The University of Queensland, Brisbane, Queensland, Australia.

School of Environmental and Rural Science, The University of New England, Armidale, New South Wales, Australia.

Publication information

Heredity (Edinb). 2016 Jul;117(1):51-61. doi: 10.1038/hdy.2016.25. Epub 2016 May 4.

Abstract

We develop a novel approach to identify regions of the genome underlying population genetic differentiation in any genetic data where the underlying population structure is unknown, or where the interest is in assessing divergence along a gradient. Our approach, EigenGWAS, combines the statistical framework for genome-wide association studies (GWASs) with eigenvector decomposition, which is commonly used in population genetics to characterize the structure of genetic data, so that loci under selection can be identified without a requirement for discrete populations. We show through theory and simulation that our approach can identify regions under selection along gradients of ancestry, and in real data we confirm this by demonstrating LCT to be under selection between HapMap CEU-TSI cohorts, and we then validate this selection signal across European countries in the POPRES samples. HERC2 was also found to be differentiated both between the CEU-TSI cohorts and within the POPRES sample, reflecting the likely anthropological differences in skin and hair colour between northern and southern European populations. Controlling for population stratification is of great importance in any quantitative genetic study and our approach also provides a simple, fast and accurate way of predicting principal components in independent samples. With ever-increasing sample sizes across many fields, this approach is likely to be widely used to obtain individual-level eigenvectors while avoiding the computational challenges associated with conducting singular value decomposition in large data sets. We have developed freely available software, Genetic Analysis Repository (GEAR), to facilitate the application of the methods.
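
The idea summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the GEAR implementation: the function names (`simulate`, `eigengwas`), the two-population simulation, the allele-frequency settings and the choice of SNP 0 as the "selected" locus are all assumptions made for illustration. The sketch treats the leading eigenvector of the genetic relationship matrix as a phenotype, regresses it on each standardized SNP, and then reuses the per-SNP effects to score an independent sample. Dividing the test statistics by the genomic inflation factor is included as an assumed, commonly used adjustment for the genome-wide signal of drift.

```python
import numpy as np

CHI2_1DF_MEDIAN = 0.4549364  # median of a 1-df chi-square distribution


def simulate(rng, n_per_pop, p1, p2):
    """Draw diploid genotypes (0/1/2) for two populations with allele frequencies p1 and p2."""
    g1 = rng.binomial(2, p1, size=(n_per_pop, p1.size))
    g2 = rng.binomial(2, p2, size=(n_per_pop, p2.size))
    return np.vstack([g1, g2]).astype(float)


def eigengwas(G, k=0):
    """Regress the k-th leading eigenvector of the GRM on every SNP.

    G is an individuals x SNPs genotype matrix; monomorphic SNPs are assumed
    to have been removed beforehand (their standard deviation would be zero).
    """
    mu, sd = G.mean(axis=0), G.std(axis=0)
    X = (G - mu) / sd                      # standardized genotypes
    n, m = X.shape
    grm = X @ X.T / m                      # genetic relationship matrix
    _, vecs = np.linalg.eigh(grm)          # eigenvalues come back in ascending order
    e = vecs[:, -(k + 1)]                  # k-th largest eigenvector
    e = (e - e.mean()) / e.std()           # use it as a standardized "phenotype"
    beta = X.T @ e / n                     # per-SNP regression slope (equals the correlation)
    chi2 = (n - 2) * beta**2 / (1.0 - beta**2)   # squared t statistic, 1 df
    lambda_gc = np.median(chi2) / CHI2_1DF_MEDIAN
    return e, beta, chi2, chi2 / lambda_gc, (mu, sd)


# Hypothetical simulation settings, chosen only for illustration.
rng = np.random.default_rng(1)
m, n_per_pop, selected = 2000, 100, 0
p_anc = rng.uniform(0.2, 0.8, m)
p1 = np.clip(p_anc + rng.normal(0, 0.05, m), 0.05, 0.95)   # mild drift in population 1
p2 = np.clip(p_anc + rng.normal(0, 0.05, m), 0.05, 0.95)   # mild drift in population 2
p1[selected], p2[selected] = 0.1, 0.9                      # one strongly differentiated SNP
labels = np.repeat([0.0, 1.0], n_per_pop)

G_train = simulate(rng, n_per_pop, p1, p2)
e, beta, chi2, chi2_gc, (mu, sd) = eigengwas(G_train)

# Score an independent sample with the training SNP effects and training scaling.
G_new = simulate(rng, n_per_pop, p1, p2)
pc_pred = ((G_new - mu) / sd) @ beta

top_snp = int(np.argmax(chi2_gc))
r_train = abs(np.corrcoef(e, labels)[0, 1])
r_pred = abs(np.corrcoef(pc_pred, labels)[0, 1])
print(top_snp, round(r_train, 2), round(r_pred, 2))
```

In this toy setting the population labels are used only to check the result; the scan itself never sees them, which is the point of working with eigenvectors rather than predefined groups. Scoring new individuals needs only a matrix-vector product, so no decomposition of the larger sample is required.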
