Templeton A R, Sing C F
Department of Biology, Washington University, St. Louis, Missouri 63130.
Genetics. 1993 Jun;134(2):659-69. doi: 10.1093/genetics/134.2.659.
We previously developed an analytical strategy based on cladistic theory to identify subsets of haplotypes that are associated with significant phenotypic deviations. Our initial approach was limited to segments of DNA in which little recombination occurs. In such cases, a cladogram can be constructed from the restriction site data to estimate the evolutionary steps that interrelate the observed haplotypes. The cladogram is then used to define a nested statistical design for identifying mutational steps associated with significant phenotypic deviations. The central assumption behind this strategy is that a mutation responsible for a particular phenotypic effect is embedded within the evolutionary history represented by the cladogram. The power of this approach depends on how accurately the cladogram portrays the evolutionary history of the DNA region. This accuracy can be diminished both by recombination and by uncertainty in the estimated cladogram topology. In a previous paper, we presented an algorithm for estimating the set of likely cladograms and recombination events. In this paper, we present an algorithm for defining a nested statistical design under cladogram uncertainty and recombination. Given the nested design, phenotypic associations can be examined using either a nested analysis of variance (for haploids or homozygous strains) or permutation testing (for outcrossed, diploid gene regions). We also extend this analytical strategy to include categorical phenotypes in addition to quantitative phenotypes. Some worked examples are presented using Drosophila data sets. These examples illustrate that some recombination may actually enhance the biological inferences that can be derived from a cladistic analysis. In particular, recombination can be used to assign a physical localization to a given subregion for mutations responsible for significant phenotypic effects.
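As a rough illustration of the permutation-testing idea mentioned in the abstract, the following minimal sketch tests whether a quantitative phenotype differs among clades at a single nesting level. It is not the authors' algorithm: the function names, the between-clade sum of squares as the test statistic, the one-label-per-individual simplification (outcrossed diploids carry two haplotypes), and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean


def between_clade_ss(values, labels):
    """Between-group sum of squares of phenotype values grouped by clade label."""
    grand = mean(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())


def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """Estimate a p-value by shuffling clade labels among individuals.

    Assumption: all individuals belong to one higher-level clade, so
    shuffling labels freely respects the nesting at this level.
    """
    rng = random.Random(seed)
    observed = between_clade_ss(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point ties.
        if between_clade_ss(values, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical phenotype measurements for two clades, "A" and "B",
# nested within the same higher-level clade.
phenotype = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
clade = ["A"] * 4 + ["B"] * 4
p_value = permutation_p_value(phenotype, clade)
```

In a full nested design, one would presumably repeat such a test at each nesting level, permuting only among the clades that share the same next-higher clade; that bookkeeping is omitted here.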