Clinical and Molecular Epidemiology Unit, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece.
PLoS One. 2007 Feb 7;2(2):e196. doi: 10.1371/journal.pone.0000196.
BACKGROUND: Genome-wide association studies hold substantial promise for identifying common genetic variants that regulate susceptibility to complex diseases. However, for the detection of small genetic effects, single studies may be underpowered. Power may be improved by combining genome-wide datasets with meta-analytic techniques.
METHODOLOGY/PRINCIPAL FINDINGS: Both single-stage and two-stage genome-wide data may be combined, and there are several possible strategies. In the two-stage framework, we considered the options of (1) enhancement of replication data and (2) enhancement of first-stage data; we also considered (3) joint meta-analyses including all first-stage and second-stage data. These strategies were examined empirically using data from two genome-wide association studies (three datasets) on Parkinson disease. With the three strategies, we derived 12, 5, and 49 single nucleotide polymorphisms (SNPs), respectively, that showed significant associations at conventional levels of statistical significance. None of these remained significant after conservative adjustment for the number of analyses performed in each strategy. However, some may warrant further consideration: 6 SNPs were identified with at least 2 of the 3 strategies, and 3 SNPs (rs1000291 on chromosome 3, rs2241743 on chromosome 4, and rs3018626 on chromosome 11) were identified with all 3 strategies and had no or minimal between-dataset heterogeneity (I² = 0%, 0%, and 15%, respectively). Analyses were primarily limited by the suboptimal overlap of tested polymorphisms across different datasets (e.g., only 31,192 polymorphisms were shared between the two tier 1 datasets).
CONCLUSIONS/SIGNIFICANCE: Meta-analysis may be used to improve the power and examine the between-dataset heterogeneity of genome-wide association studies. Prospective designs may be most efficient, if they try to maximize the overlap of genotyping platforms and anticipate the combination of data across many genome-wide association studies.
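As an illustration of the quantities reported above, the following minimal Python sketch pools per-dataset effect estimates for a single SNP and computes the I² heterogeneity statistic. It assumes an inverse-variance fixed-effect model on log odds ratios and a Bonferroni-style correction for the number of analyses; the abstract does not state which model or adjustment was used, so both are assumptions, and the function names and any input values are hypothetical.

```python
import math


def fixed_effect_meta(log_odds_ratios, standard_errors):
    """Pool per-dataset log odds ratios for one SNP.

    Assumption: inverse-variance fixed-effect weighting (not confirmed
    by the abstract as the authors' method).
    """
    # Each dataset is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)

    pooled = sum(w * b for w, b in zip(weights, log_odds_ratios)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_odds_ratios))
    df = len(log_odds_ratios) - 1

    # I^2: percentage of variability attributed to between-dataset
    # heterogeneity rather than chance, truncated at 0.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Two-sided p-value from the standard normal distribution.
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))

    return {
        "pooled_log_or": pooled,
        "pooled_se": pooled_se,
        "q": q,
        "i_squared": i_squared,
        "p_value": p_value,
    }


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction
    (assumed here as one example of a conservative adjustment)."""
    return alpha / n_tests
```

A SNP would then count as significant after adjustment only if its pooled `p_value` falls below `bonferroni_threshold(0.05, n_tests)`, where `n_tests` is the number of analyses performed in the strategy, for example the number of polymorphisms shared across the combined datasets.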