Liang K Y, Chiu Y F, Beaty T H
Department of Biostatistics, School of Hygiene and Public Health, Johns Hopkins University, Baltimore, MD 21205, USA.
Hum Hered. 2001;51(1-2):64-78. doi: 10.1159/000022961.
Multipoint linkage analysis is a powerful tool for localizing susceptibility genes for complex diseases. However, the conventional lod score method relies critically on correct specification of the mode of inheritance for accurate estimation of gene position. Allele-sharing methods, on the other hand, are designed as currently practiced to test the null hypothesis of no linkage rather than to estimate the location of the susceptibility gene(s). In this paper, we propose an identity-by-descent (IBD)-based procedure to estimate the location of an unobserved susceptibility gene within a chromosomal region framed by multiple markers. We deal with the practical situation where some of the markers might not be fully informative. In that case, the IBD statistic at an arbitrary locus within the region is imputed using the multipoint marker information. The method is robust in that no assumption about the genetic mechanism is required other than that the region contains no more than one susceptibility gene. In particular, this approach builds upon a simple representation for the expected IBD at any arbitrary locus within the region using data from affected sib pairs. With this representation, one can carry out a parametric inference procedure to locate an unobserved susceptibility gene. In addition, we derive a sample size formula for the number of affected sib pairs needed to detect linkage with multiple markers. Throughout, the proposed method is illustrated with simulated data. We have implemented this method, including exploratory and formal model-fitting procedures to locate susceptibility genes plus sample size and power calculations, in a program, GENEFINDER, which will be made available shortly.
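The abstract does not state the "simple representation" itself, so the following is only an illustrative sketch under stated assumptions. It assumes the standard sib-pair result that, with a single susceptibility locus at position tau, the expected number of alleles shared IBD at position t is 1 + C (1 - 2θ)^2, where θ is the recombination fraction between t and tau and C is the excess sharing at tau (between 0 and 1). It further assumes Haldane's map function, so that (1 - 2θ)^2 = exp(-0.04 d) for a distance d in cM. The function names and the least-squares grid search are hypothetical stand-ins for illustration; they are not GENEFINDER's interface or the paper's inference procedure.

```python
import math

# Assumption: Haldane's map function, so (1 - 2*theta)^2 = exp(-0.04 * d), d in cM.
HALDANE_RATE = 0.04


def expected_ibd(t_cm, tau_cm, c):
    """Expected IBD sharing (0 to 2 alleles) at t_cm for affected sib pairs,
    assuming one susceptibility locus at tau_cm with excess sharing c."""
    return 1.0 + c * math.exp(-HALDANE_RATE * abs(t_cm - tau_cm))


def fit_location(positions_cm, mean_ibd, step_cm=0.1):
    """Hypothetical least-squares fit: grid search over tau, with c solved
    in closed form at each candidate tau. Returns (tau_hat, c_hat)."""
    lo, hi = min(positions_cm), max(positions_cm)
    n_steps = int(round((hi - lo) / step_cm))
    y = [m - 1.0 for m in mean_ibd]  # excess sharing over the null value of 1
    best = None
    for k in range(n_steps + 1):
        tau = lo + k * step_cm
        w = [math.exp(-HALDANE_RATE * abs(t - tau)) for t in positions_cm]
        # For fixed tau the model y = c * w is linear in c.
        c = sum(wi * yi for wi, yi in zip(w, y)) / sum(wi * wi for wi in w)
        sse = sum((yi - c * wi) ** 2 for wi, yi in zip(w, y))
        if best is None or sse < best[0]:
            best = (sse, tau, c)
    return best[1], best[2]


if __name__ == "__main__":
    # Noise-free mean IBD at markers every 10 cM, generated from the model itself.
    markers = [10.0 * i for i in range(11)]
    observed = [expected_ibd(t, 37.0, 0.3) for t in markers]
    print(fit_location(markers, observed))
```

The sketch takes the mean IBD sharing at each marker as given. In the setting described above, where markers are not fully informative, those values would first have to be imputed from the multipoint marker data, which is not shown here.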