North B V, Curtis D, Cassell P G, Hitman G A, Sham P C
Academic Department of Psychiatry, Barts and The London Queen Mary's School of Medicine and Dentistry, London E1 1BB, UK.
Ann Hum Genet. 2003 Jul;67(Pt 4):348-56. doi: 10.1046/j.1469-1809.2003.00030.x.
Biallelic markers, such as single nucleotide polymorphisms (SNPs), provide greater information for localising disease loci when treated as multilocus haplotypes, but often haplotypes are not immediately available from multilocus genotypes in case-control studies. An artificial neural network allows investigation of association between disease phenotype and tightly linked markers without requiring haplotype phase and without modelling any evolutionary history for the disease-related haplotypes. The network assesses whether marker haplotypes differ between cases and controls to the extent that classification of disease status based on multi-marker genotypes is achievable. The network is "trained" to "recognise" affection status based on supplied marker genotypes, and then for each multi-marker genotype it produces outputs which aim to approximate the associated affection status. Next, the genotypes are permuted relative to affection status to produce many random datasets and the process of training and recording of outputs is repeated. The extent to which the ability to predict affection for the real dataset exceeds that for the random datasets measures the statistical significance of the association between multi-marker genotype and affection. This permutation test performs well with simulated case-control datasets, particularly when major gene effects are present. We have explored the effects of systematically varying different network parameters in order to identify their optimal values. We have applied the permutation test to 4 SNPs of the calpain 10 (CAPN10) gene typed in a case-control sample of subjects with type 2 diabetes, impaired glucose tolerance, and controls. We show that the neural network produces more highly significant evidence for association than do single marker tests corrected for the number of markers genotyped. 
Using a permutation test could also allow conditional analyses that incorporate known risk factors alongside marker genotypes. Permuting only the marker genotypes, relative to affection status and these risk factors, would allow the independent contribution of the markers to disease risk to be assessed.
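The permutation procedure described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 0/1/2 genotype coding (centred to -1/0/1), the one-hidden-layer architecture, the learning rate and epoch count, the use of mean squared error as the measure of fit, and the names `network_error` and `permutation_p_value`. The data are simulated, with the first marker acting as a strong major-gene effect.

```python
import numpy as np


def _forward(X, W1, b1, W2, b2):
    """Return hidden activations and predicted affection probabilities."""
    H = np.tanh(X @ W1 + b1)
    out = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))
    return H, out


def network_error(X, y, hidden=3, epochs=300, lr=0.5, seed=0):
    """Train a small network on (X, y) and return its mean squared error.

    Hypothetical stand-in for the network in the abstract; the real
    architecture and training settings are not specified there.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H, out = _forward(X, W1, b1, W2, b2)
        d_out = (out - y) / len(y)              # cross-entropy gradient w.r.t. the logit
        d_hid = np.outer(d_out, W2) * (1.0 - H ** 2)
        grad_W2, grad_b2 = H.T @ d_out, d_out.sum()
        grad_W1, grad_b1 = X.T @ d_hid, d_hid.sum(axis=0)
        W1 -= lr * grad_W1
        b1 -= lr * grad_b1
        W2 -= lr * grad_W2
        b2 -= lr * grad_b2
    _, out = _forward(X, W1, b1, W2, b2)
    return float(np.mean((out - y) ** 2))


def permutation_p_value(X, y, n_perm=99, seed=0):
    """Empirical p-value: how often a permuted dataset is fitted at least as well."""
    rng = np.random.default_rng(seed)
    observed = network_error(X, y)
    at_least_as_good = 0
    for _ in range(n_perm):
        # Shuffling affection status against fixed genotypes is equivalent
        # to permuting the genotypes relative to affection status.
        y_perm = rng.permutation(y)
        if network_error(X, y_perm) <= observed:
            at_least_as_good += 1
    return (at_least_as_good + 1) / (n_perm + 1)


# Simulated case-control data: 200 subjects, 4 biallelic markers.
rng = np.random.default_rng(1)
genotypes = rng.integers(0, 3, size=(200, 4)).astype(float) - 1.0
risk = 0.5 + 0.4 * genotypes[:, 0]              # only the first marker affects risk
status = (rng.random(200) < risk).astype(float)

p_value = permutation_p_value(genotypes, status)
print(p_value)
```

With `n_perm=99` the smallest attainable p-value is 0.01, so a real analysis would need many more permutations. For the conditional analysis suggested above, one plausible adaptation would be to add the risk factors as further input columns and shuffle only the genotype rows, keeping each subject's affection status and risk factors together.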