Davies Robert W, Dandona Sonny, Stewart Alexandre F R, Chen Li, Ellis Stephan G, Tang W H Wilson, Hazen Stanley L, Roberts Robert, McPherson Ruth, Wells George A
Cardiovascular Research Methods Centre, University of Ottawa Heart Institute, Ontario, Canada.
Circ Cardiovasc Genet. 2010 Oct;3(5):468-74. doi: 10.1161/CIRCGENETICS.110.946269. Epub 2010 Aug 21.
Genome-wide association studies (GWAS) have identified single-nucleotide polymorphisms (SNPs) at multiple loci that are significantly associated with coronary artery disease (CAD) risk. In this study, we sought to determine and compare the predictive capabilities of 9p21.3 alone and a panel of SNPs identified and replicated through GWAS for CAD.
We used the Ottawa Heart Genomics Study (OHGS) (3323 cases, 2319 control subjects) and the Wellcome Trust Case Control Consortium (WTCCC) (1926 cases, 2938 control subjects) data sets. We compared the ability of three methods to predict CAD: allele counting, logistic regression, and support vector machines. Two sets of SNPs were considered: 9p21.3 alone, and a set of 12 SNPs identified by GWAS and selected through a model-fitting procedure. Performance was assessed by measuring area under the curve (AUC), using 10-fold cross-validation in OHGS and WTCCC as a replication set. In OHGS, the AUC for logistic regression was significantly higher with the 12 SNPs than with 9p21.3 alone (0.608 versus 0.555, P=3.59×10⁻¹⁴). This difference remained when traditional risk factors were considered in a subgroup of OHGS (1388 cases, 2038 control subjects), with AUC increasing from 0.804 to 0.809 (P=0.037). The added predictive value over and above the traditional risk factors was not significant for 9p21.3 (AUC 0.801 versus 0.804, P=0.097) but was significant for the 12 SNPs (AUC 0.801 versus 0.809, P=0.0073). Performance was similar between OHGS and WTCCC. Logistic regression outperformed both support vector machines and allele counting.
Used collectively, the 12 SNPs confer significantly greater predictive capability for CAD than 9p21.3 alone, whether or not traditional risk factors are considered. More accurate models will probably evolve as additional CAD-associated SNPs are identified.
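To make the simplest of the compared methods concrete, the following is a minimal Python sketch of allele counting scored by AUC. It is not the authors' code: the function names `allele_count_score` and `auc`, the three-SNP genotype vectors, and the coding of each genotype as 0, 1, or 2 risk alleles are all assumptions made for illustration, and the toy data are invented rather than taken from OHGS or WTCCC. The AUC is computed as the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half.

```python
def allele_count_score(genotypes):
    """Unweighted risk score: total number of risk alleles carried.

    `genotypes` holds one entry per SNP, coded 0, 1, or 2 (assumed coding).
    """
    return sum(genotypes)


def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (case, control) pairs in which the case
    scores higher, with tied pairs contributing 0.5.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Invented toy genotypes (3 hypothetical SNPs per person), for illustration only.
toy_cases = [[2, 1, 1], [1, 1, 0], [2, 2, 1]]
toy_controls = [[0, 1, 0], [1, 1, 0], [1, 0, 0]]

toy_case_scores = [allele_count_score(g) for g in toy_cases]
toy_control_scores = [allele_count_score(g) for g in toy_controls]
toy_auc = auc(toy_case_scores, toy_control_scores)
print(f"Toy allele-counting AUC: {toy_auc:.3f}")
```

Allele counting weights every SNP equally, whereas logistic regression estimates a separate weight per SNP, which is one plausible reason it could perform better, as reported above. The pairwise loop is quadratic in sample size; for cohorts of the size described here a rank-based computation would be the more practical choice.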