Genetics and Genomic Biology, Hospital for Sick Children, Toronto, Ontario, Canada.
BMC Genet. 2005 Dec 30;6 Suppl 1(Suppl 1):S38. doi: 10.1186/1471-2156-6-S1-S38.
We have developed a recursive-partitioning (RP) algorithm for identifying phenotype and covariate groupings that interact with the evidence for linkage. This data-mining approach for detecting gene × environment interactions uses genotype and covariate data on affected relative pairs to find evidence for linkage heterogeneity across covariate-defined subgroups. We adapted a likelihood-ratio-based test of linkage, parameterized with relative risks, to a recursive-partitioning framework that includes a cross-validation-based deviance measurement for choosing the optimal tree size and a bootstrap sampling procedure for choosing a robust tree structure. Individuals in ALDX2 category 5 were considered affected, those in categories 1 and 3 unaffected, and all others unknown. We sampled non-overlapping affected relative pairs from each family, yielding 144 affected pairs for the RP model. Twenty pair-level covariates were defined from smoking status, maximum drinks, ethnicity, sex, and age at onset. Using the all-pairs score in GENEHUNTER, the nonparametric linkage tests showed no regions with suggestive linkage evidence. With the RP model, however, several suggestive regions were found on chromosomes 2, 4, 6, 14, and 20, with associated covariates such as sex and age at onset detected.
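The partitioning idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes fully informative IBD sharing counts at a single locus, binary pair-level covariates, and a simple one-parameter mean-sharing likelihood-ratio statistic standing in for the relative-risk parameterization used in the paper. The names `Pair`, `Node`, `linkage_lr`, and `grow`, and the `min_size` and `min_gain` thresholds, are hypothetical. The cross-validation and bootstrap steps are omitted.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Pair:
    ibd: int                # alleles shared IBD at the locus (0, 1, or 2); assumed known
    covs: Tuple[int, ...]   # binary pair-level covariates (assumed coding: 0 or 1)


@dataclass
class Node:
    lr: float                       # linkage LR statistic for the pairs in this node
    n: int                          # number of pairs in this node
    split: Optional[int] = None     # index of the covariate used to split, if any
    left: Optional["Node"] = None   # pairs with covariate == 0
    right: Optional["Node"] = None  # pairs with covariate == 1


def linkage_lr(pairs: Sequence[Pair]) -> float:
    """One-sided LR statistic for excess allele sharing (null sharing = 0.5).

    Simplified stand-in for the relative-risk-based test: each pair
    contributes two binomial trials with sharing probability p >= 0.5.
    """
    trials = 2 * len(pairs)
    if trials == 0:
        return 0.0
    shared = sum(p.ibd for p in pairs)
    p_hat = max(shared / trials, 0.5)  # constrain the estimate to excess sharing

    def loglik(p: float) -> float:
        ll = 0.0
        if shared:
            ll += shared * math.log(p)
        if trials - shared:
            ll += (trials - shared) * math.log(1.0 - p)
        return ll

    return 2.0 * (loglik(p_hat) - loglik(0.5))


def grow(pairs: Sequence[Pair], min_size: int = 10, min_gain: float = 2.0) -> Node:
    """Recursively split on the covariate giving the largest gain in the LR statistic."""
    node = Node(lr=linkage_lr(pairs), n=len(pairs))
    n_cov = len(pairs[0].covs) if pairs else 0
    best = None
    for j in range(n_cov):
        lo = [p for p in pairs if p.covs[j] == 0]
        hi = [p for p in pairs if p.covs[j] == 1]
        if len(lo) < min_size or len(hi) < min_size:
            continue  # skip splits that leave a subgroup too small
        # Heterogeneity gain: children fitted separately versus parent fitted jointly.
        gain = linkage_lr(lo) + linkage_lr(hi) - node.lr
        if gain > min_gain and (best is None or gain > best[0]):
            best = (gain, j, lo, hi)
    if best is not None:
        _, node.split, lo, hi = best
        node.left = grow(lo, min_size, min_gain)
        node.right = grow(hi, min_size, min_gain)
    return node
```

Calling `grow` on a list of `Pair` objects returns the root `Node`. A node whose `split` is `None` is a leaf, and a leaf with a large `lr` marks a covariate-defined subgroup with stronger sharing evidence than the pooled sample. In a fuller treatment, `min_gain` would be replaced by pruning based on cross-validated deviance, as the abstract describes.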