Setsirichok Damrongrit, Tienboon Phuwadej, Jaroonruang Nattapong, Kittichaijaroen Somkit, Wongseree Waranyu, Piroonratana Theera, Usavanarong Touchpong, Limwongse Chanin, Aporntewan Chatchawit, Phadoongsidhi Marong, Chaiyaratana Nachol
Department of Electrical and Computer Engineering, Faculty of Engineering, King Mongkut's University of Technology North Bangkok, 1518 Pracharat Sai 1 Road, Bangsue, Bangkok 10800, Thailand.
Department of Computer Engineering, Faculty of Engineering, King Mongkut's University of Technology Thonburi, 126 Pracha-utid Road, Bangmod, Toongkru, Bangkok 10140, Thailand.
Springerplus. 2013 May 19;2:230. doi: 10.1186/2193-1801-2-230. eCollection 2013.
This article demonstrates the ability of an omnibus permutation test on ensembles of two-locus analyses (2LOmb) to detect pure epistasis in the presence of genetic heterogeneity. The performance of 2LOmb is evaluated in various simulation scenarios covering two independent causes of complex disease, where each cause is governed by a purely epistatic interaction. The scenarios differ in the number of available single nucleotide polymorphisms (SNPs) in the data, the number of causative SNPs and the ratio of case samples from the two affected groups. The simulation results indicate that 2LOmb outperforms multifactor dimensionality reduction (MDR) and random forest (RF) techniques: it reports fewer output SNPs and correctly identifies more causative SNPs. Moreover, 2LOmb is capable of identifying the number of independent interactions in tractable computational time and can be used in genome-wide association studies. 2LOmb is subsequently applied to a type 1 diabetes mellitus (T1D) data set, which was collected from a UK population by the Wellcome Trust Case Control Consortium (WTCCC). After screening for SNPs that are located within or near genes and exhibit no marginal single-locus effects, the T1D data set is reduced to 95,991 SNPs from 12,146 genes. The 2LOmb search in the reduced T1D data set reveals that 12 SNPs, which can be divided into two independent sets, are associated with the disease. The first SNP set consists of three SNPs from MUC21 (mucin 21, cell surface associated), three SNPs from MUC22 (mucin 22), two SNPs from PSORS1C1 (psoriasis susceptibility 1 candidate 1) and one SNP from TCF19 (transcription factor 19). A four-locus interaction between these four genes is also detected. The second SNP set consists of three SNPs from ATAD1 (ATPase family, AAA domain containing 1). Overall, the findings indicate the detection of pure epistasis in the presence of genetic heterogeneity and provide an alternative explanation for the aetiology of T1D in the UK population.
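To make the idea of an omnibus permutation test over two-locus analyses concrete, the following is a minimal Python sketch. It is not the authors' 2LOmb implementation, whose details the abstract does not give. It assumes that each SNP pair is scored with a Pearson chi-square test on the joint genotype versus case/control status, that the omnibus statistic is the largest chi-square over all pairs, and that significance is obtained by shuffling the case/control labels. All function names, the genotype coding (0/1/2) and the toy data generator are hypothetical. The toy data follow a purely epistatic model: SNPs 0 and 1 jointly determine disease status, yet neither shows a marginal single-locus effect.

```python
import random
from collections import Counter
from itertools import combinations


def two_locus_chi2(samples, labels, i, j):
    """Pearson chi-square of the joint genotype at SNPs i and j versus case/control status."""
    cells = Counter()       # (joint genotype, label) -> count
    row_totals = Counter()  # joint genotype -> count
    col_totals = Counter()  # label -> count
    for genotypes, label in zip(samples, labels):
        joint = (genotypes[i], genotypes[j])
        cells[(joint, label)] += 1
        row_totals[joint] += 1
        col_totals[label] += 1
    n = len(labels)
    chi2 = 0.0
    for joint, row in row_totals.items():
        for label, col in col_totals.items():
            expected = row * col / n
            chi2 += (cells[(joint, label)] - expected) ** 2 / expected
    return chi2


def best_pair(samples, labels):
    """Return the SNP pair with the largest two-locus chi-square and that statistic."""
    best, best_stat = None, -1.0
    for i, j in combinations(range(len(samples[0])), 2):
        stat = two_locus_chi2(samples, labels, i, j)
        if stat > best_stat:
            best, best_stat = (i, j), stat
    return best, best_stat


def omnibus_permutation_test(samples, labels, n_perm=99, seed=0):
    """Permutation p-value for the maximum two-locus statistic (assumed omnibus statistic)."""
    rng = random.Random(seed)
    pair, observed = best_pair(samples, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-phenotype association
        _, stat = best_pair(samples, shuffled)
        if stat >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return pair, observed, (exceed + 1) / (n_perm + 1)


def make_demo_data(n_noise=4, copies=10, seed=1):
    """Toy data: SNPs 0 and 1 interact with no marginal effects; the rest are noise."""
    rng = random.Random(seed)
    weights = {0: 1, 1: 2, 2: 1}  # genotype proportions for allele frequency 0.5
    samples, labels = [], []
    for a in range(3):
        for b in range(3):
            for _ in range(weights[a] * weights[b] * copies):
                noise = [rng.choice([0, 1, 1, 2]) for _ in range(n_noise)]
                samples.append([a, b] + noise)
                labels.append((a + b) % 2)  # case only when exactly one locus is heterozygous
    return samples, labels


if __name__ == "__main__":
    samples, labels = make_demo_data()
    pair, stat, p_value = omnibus_permutation_test(samples, labels)
    print(pair, stat, p_value)
```

Because the permutation recomputes the maximum over all pairs, the resulting p-value accounts for the number of pairs tested. The exhaustive pair scan in this sketch grows quadratically with the number of SNPs, so it is meant only for small illustrative inputs.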