Department of Statistics and Information Science, Dongguk University, Gyeongju 780-714, Republic of Korea.
Comput Math Methods Med. 2013;2013:340678. doi: 10.1155/2013/340678. Epub 2013 Sep 24.
One of the main objectives of a genome-wide association study (GWAS) is to develop a prediction model for a binary clinical outcome using single-nucleotide polymorphisms (SNPs); such a model can be used for diagnostic and prognostic purposes and for better understanding of the relationship between the disease and SNPs. Penalized support vector machine (SVM) methods have been widely used toward this end. However, because investigators often ignore the genetic models of the SNPs, the final model loses efficiency in predicting the clinical outcome. To overcome this problem, we propose a two-stage method in which the genetic model of each SNP is first identified using the MAX test and a prediction model is then fitted using a penalized SVM method. We apply the proposed method to various penalized SVMs and compare the performance of SVMs using various penalty functions. The results from simulations and real GWAS data analysis show that the proposed method performs better than prediction methods that ignore the genetic models in terms of prediction power and selectivity.
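The first stage of the two-stage method can be illustrated with a minimal sketch. It assumes the MAX test is taken in its common form, the largest absolute Cochran-Armitage trend statistic over the recessive, additive, and dominant score vectors; the abstract does not give the exact variant used. The names `SCORES`, `trend_z`, `select_model`, and `recode` are hypothetical helpers introduced here, and the genotype counts in any call are made up for illustration. The sketch only picks the best-fitting model for one SNP. It does not compute a p-value for the MAX statistic, which would require the joint distribution of the three correlated statistics.

```python
import math

# Scores for genotypes carrying 0, 1, or 2 copies of the risk allele
# under each candidate genetic model (assumed coding, not from the paper).
SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def trend_z(case_counts, control_counts, scores):
    """Cochran-Armitage trend Z statistic for one SNP and one score vector."""
    n_cases, n_controls = sum(case_counts), sum(control_counts)
    n = n_cases + n_controls
    totals = [r + s for r, s in zip(case_counts, control_counts)]
    # Numerator: score-weighted difference between case and control counts.
    t = sum(
        x * (n_controls * r - n_cases * s)
        for x, r, s in zip(scores, case_counts, control_counts)
    )
    # Null variance of the numerator.
    a = n * sum(x * x * m for x, m in zip(scores, totals)) - sum(
        x * m for x, m in zip(scores, totals)
    ) ** 2
    var = n_cases * n_controls * a / n
    return t / math.sqrt(var) if var > 0 else 0.0


def select_model(case_counts, control_counts):
    """Return the model with the largest |Z| and that maximum (the MAX statistic)."""
    z = {m: trend_z(case_counts, control_counts, s) for m, s in SCORES.items()}
    best = max(z, key=lambda m: abs(z[m]))
    return best, abs(z[best])


def recode(genotypes, model):
    """Recode 0/1/2 genotypes with the scores of the selected model."""
    scores = SCORES[model]
    return [scores[g] for g in genotypes]
```

In the second stage, the recoded SNP columns would serve as the features of a penalized SVM. One possible choice, named here as an assumption rather than the authors' software, is an L1-penalized linear SVM such as scikit-learn's `LinearSVC(penalty="l1", dual=False)`.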