Population substructure and control selection in genome-wide association studies.

Authors

Yu Kai, Wang Zhaoming, Li Qizhai, Wacholder Sholom, Hunter David J, Hoover Robert N, Chanock Stephen, Thomas Gilles

Affiliation

Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, United States of America.

Publication

PLoS One. 2008 Jul 2;3(7):e2551. doi: 10.1371/journal.pone.0002551.

Abstract

Determination of the relevance of both demanding classical epidemiologic criteria for control selection and robust handling of population stratification (PS) represents a major challenge in the design and analysis of genome-wide association studies (GWAS). Empirical data from two GWAS in European Americans of the Cancer Genetic Markers of Susceptibility (CGEMS) project were used to evaluate the impact of PS in studies with different control selection strategies. In each of the two original case-control studies nested in corresponding prospective cohorts, a minor confounding effect due to PS (inflation factor lambda of 1.025 and 1.005) was observed. In contrast, when the control groups were exchanged to mimic a cost-effective but theoretically less desirable control selection strategy, the confounding effects were larger (lambda of 1.090 and 1.062). A panel of 12,898 autosomal SNPs common to both the Illumina and Affymetrix commercial platforms and with low local background linkage disequilibrium (pair-wise r² < 0.004) was selected to infer population substructure with principal component analysis. A novel permutation procedure was developed for the correction of PS that identified a smaller set of principal components and achieved a better control of type I error (to lambda of 1.032 and 1.006, respectively) than currently used methods. The overlap between sets of SNPs in the bottom 5% of p-values based on the new test and the test without PS correction was about 80%, with the majority of discordant SNPs having both ranks close to the threshold. Thus, for the CGEMS GWAS of prostate and breast cancer conducted in European Americans, PS does not appear to be a major problem in well-designed studies. A study using suboptimal controls can have acceptable type I error when an effective strategy for the correction of PS is employed.
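The inflation factor lambda quoted in the abstract can be illustrated with a short sketch. The abstract does not say how lambda was computed, so the code below assumes the common median-based estimator: the median of the observed 1-degree-of-freedom chi-square association statistics divided by the median expected under the null (about 0.4549). The function name, the simulated statistics, and the 9% scaling are illustrative assumptions and are not CGEMS data.

```python
import random
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Median-based inflation factor (assumed estimator, not confirmed by the abstract).

    Divides the observed median of 1-df chi-square statistics by the
    median expected under the null hypothesis of no association.
    """
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


# Simulated null scan: the square of a standard normal draw is chi-square with 1 df.
rng = random.Random(0)
null_stats = [rng.gauss(0.0, 1.0) ** 2 for _ in range(100_000)]
lam_null = inflation_factor(null_stats)

# Hypothetical stratified scan: every statistic scaled up by 9%,
# which scales the estimated lambda by the same factor.
lam_inflated = inflation_factor([1.09 * s for s in null_stats])

print(f"lambda (null scan):     {lam_null:.3f}")
print(f"lambda (inflated scan): {lam_inflated:.3f}")
```

Under this estimator, a value near 1 indicates little systematic inflation of the test statistics, which is how values such as 1.005 and 1.090 in the abstract are typically read.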
