Amador Carmen, Huffman Jennifer, Trochet Holly, Campbell Archie, Porteous David, Wilson James F, Hastie Nick, Vitart Veronique, Hayward Caroline, Navarro Pau, Haley Chris S
MRC IGMM, University of Edinburgh, Edinburgh, EH4 2XU, UK.
Centre for Population Health Sciences, University of Edinburgh, Edinburgh, EH8 9AG, UK.
BMC Genomics. 2015 Jun 6;16(1):437. doi: 10.1186/s12864-015-1605-2.
The Generation Scotland Scottish Family Health Study (GS:SFHS) includes 23,960 participants from across Scotland with records for many health-related traits and environmental covariates. Genotypes at ~700K SNPs are currently available for 10,000 participants. The cohort was designed as a resource for genetic and health-related research and the study of complex traits. In this study we developed a suite of analyses to disentangle the genomic differentiation among GS:SFHS individuals, in order to describe and optimise the sample and methods for future analyses.
We combined the genotypic information of GS:SFHS with that of 1092 individuals from the 1000 Genomes Project and estimated their genomic relationships. We then performed Principal Component Analyses of the resulting relationships to investigate the genomic origin of different groups. We characterised two groups of individuals: those with a few sparse rare markers in the genome, and those with several large rare haplotypes, which might represent relatively recent exogenous ancestors. We identified some individuals with likely Italian ancestry and a group with some potential African/Asian ancestry. An analysis of homozygosity in the GS:SFHS sample revealed a pattern very similar to that of other European populations. We also identified an individual carrying a chromosome 1 uniparental disomy. We found evidence of local geographic stratification within the population that affects its genomic structure.
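The relationship-then-PCA workflow described above can be illustrated with a minimal sketch. The abstract does not state which relationship estimator or software was used, so the sketch assumes a common allele-frequency-weighted estimator (cross-products of centred allele counts divided by 2Σp(1−p)) and runs on simulated genotypes, not GS:SFHS or 1000 Genomes data. The function names `genomic_relationship_matrix` and `principal_components` are hypothetical.

```python
import numpy as np


def genomic_relationship_matrix(genotypes):
    """Estimate genomic relationships from 0/1/2 allele counts.

    genotypes: array of shape (n_individuals, n_snps).
    Assumed estimator (not confirmed by the source): Z Z' / (2 * sum p(1-p)),
    where Z holds allele counts centred on twice the sample allele frequency.
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0              # sample allele frequency per SNP
    keep = (p > 0.0) & (p < 1.0)          # drop monomorphic SNPs
    g, p = g[:, keep], p[keep]
    z = g - 2.0 * p                       # centre each SNP column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def principal_components(grm, n_components=2):
    """Return the leading principal components of a relationship matrix."""
    eigvals, eigvecs = np.linalg.eigh(grm)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # largest first
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals), top_vals


# Simulated example: a main group plus a smaller group whose allele
# frequencies have drifted away from it (illustrative values only).
rng = np.random.default_rng(0)
n_snps = 500
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.2, n_snps), 0.05, 0.95)
pop_a = rng.binomial(2, freq_a, size=(40, n_snps))
pop_b = rng.binomial(2, freq_b, size=(10, n_snps))
genotypes = np.vstack([pop_a, pop_b])

grm = genomic_relationship_matrix(genotypes)
pcs, eigvals = principal_components(grm, n_components=2)
print("mean PC1, group A:", pcs[:40, 0].mean())
print("mean PC1, group B:", pcs[40:, 0].mean())
```

In this toy setting the two simulated groups fall on opposite sides of the first principal component, which is the kind of separation a PCA of genomic relationships is used to reveal. With real data, the reference panel would supply the known-ancestry samples against which cohort individuals are compared.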
These findings illuminate the history of the Scottish population and have implications for further analyses such as the study of the contributions of common and rare variants to trait heritabilities and the evaluation of genomic and phenotypic prediction of disease.