Smith Lucian P, Kuhner Mary K
Department of Genome Sciences, University of Washington, Seattle, WA 98195-5065, USA.
Genet Epidemiol. 2009 May;33(4):344-56. doi: 10.1002/gepi.20387.
When a novel genetic trait arises in a population, it introduces a signal in the haplotype distribution of that population. Through recombination, that signal's history becomes differentiated from that of the DNA distant from it, but remains similar to that of the DNA close by. Fine-scale mapping techniques rely on this differentiation to pinpoint trait loci. In this study, we analyzed the differentiation itself to better understand how much information is available to these techniques. Simulated alleles on known recombinant coalescent trees show the upper limit for fine-scale mapping. Varying the characteristics of the population being studied increases or decreases this limit. The initial uncertainty in map position has the most direct influence on the final precision of the estimate, with wider initial areas resulting in wider final estimates, though the increase is sigmoidal rather than linear. The Theta of the trait (4Nμ) is also important, with lower values of Theta resulting in greater precision of trait placement up to a point; the increase is sigmoidal as Theta decreases. Collecting data from more individuals can increase precision, though only logarithmically with the total number of individuals, so that each added individual contributes less to the final precision. However, a case/control analysis has the potential to greatly increase the effective number of individuals, as the bulk of the information lies in the differential between affected and unaffected genotypes. If haplotypes are unknown due to incomplete penetrance, much information is lost, and more is lost the less indicative the phenotype is of the underlying genotype.
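The two quantitative relationships named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' simulation code. It assumes that N is the diploid effective population size and mu the per-generation mutation rate at the trait locus (the abstract gives only "4Nmu"), and it uses the plain natural logarithm as a stand-in for "precision grows logarithmically with sample size". The function names and the numeric values are hypothetical.

```python
import math


def theta(effective_population_size: float, mutation_rate: float) -> float:
    """Population-scaled mutation rate, Theta = 4 * N * mu.

    Assumption: N is the diploid effective population size and mu is the
    per-generation mutation rate at the trait locus.
    """
    return 4.0 * effective_population_size * mutation_rate


def marginal_log_gain(sample_size: int) -> float:
    """Increase in log(sample size) from adding one more individual.

    Illustrative stand-in only: if precision scales with log(n), this is
    the (unscaled) gain contributed by individual n + 1.
    """
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    return math.log(sample_size + 1) - math.log(sample_size)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    print("Theta:", theta(10_000, 2.5e-8))
    for n in (10, 100, 1000):
        print(f"gain from adding one individual to n={n}:", marginal_log_gain(n))
```

Under these assumptions, the marginal gain shrinks roughly as 1/n, which is the sense in which each added individual contributes less to the final precision.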