Ronin Yefim I, Mester David I, Minkov Dina G, Akhunov Eduard, Korol Abraham B
Institute of Evolution and Department of Evolutionary and Environmental Biology, University of Haifa, 3498838, Israel.
Department of Plant Pathology, Kansas State University, Manhattan, Kansas 66506.
Genetics. 2017 Jul;206(3):1285-1295. doi: 10.1534/genetics.116.197491. Epub 2017 May 16.
This study addresses the problem of building genetic maps with ∼10³-10⁴ markers per chromosome. We consider a spectrum of situations with intrachromosomal heterogeneity of recombination rate, different levels of genotyping error, and missing data. In the ideal scenario, with no errors and no missing data, the majority of markers should appear as groups of cosegregating markers ("twins"), which pose no challenge for map construction. The central aspect of the proposed approach is to take into account the structure of the marker space, in which each twin group (TG) and each singleton marker is represented as a point. The confounding effect of genotyping errors and missing data reduces TG size, but when these effects are weak the surviving TGs can still serve as a source of reliable skeletal markers. As the level of confounding effects increases, the number of usable TGs, and correspondingly of skeletal markers, decreases considerably, and usable TGs may disappear altogether. Here, we show that the paucity of informative markers can be compensated for by detecting kernels of markers in the marker space using a clustering procedure, and we demonstrate the utility of this approach for high-density genetic map construction on simulated and experimentally obtained genotyping datasets.
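The notion of twin groups in the ideal, error-free scenario can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `min_size` threshold, and the toy genotype strings are all hypothetical, and the sketch does not handle genotyping errors or missing data, nor the clustering step used to detect kernels of markers.

```python
from collections import defaultdict


def twin_groups(genotypes):
    """Group markers whose genotype vectors are identical across all individuals.

    `genotypes` maps a marker name to a sequence of genotype calls,
    one call per individual (hypothetical input format).
    """
    groups = defaultdict(list)
    for marker, calls in genotypes.items():
        # Markers with exactly the same calls cosegregate ("twins").
        groups[tuple(calls)].append(marker)
    return list(groups.values())


def skeletal_markers(genotypes, min_size=2):
    """Pick one representative from each twin group with at least `min_size` markers.

    `min_size` is an assumed illustrative threshold, not a value from the study.
    """
    return [group[0] for group in twin_groups(genotypes) if len(group) >= min_size]


# Toy data: five markers scored in five individuals (illustrative only).
genotypes = {
    "m1": "AABBA",
    "m2": "AABBA",
    "m3": "AABBB",
    "m4": "AABBA",
    "m5": "ABBBB",
}

groups = twin_groups(genotypes)
skeleton = skeletal_markers(genotypes)
print(groups)
print(skeleton)
```

In this toy data, m1, m2, and m4 form one twin group while m3 and m5 are singletons. A single erroneous call in m2 would split it from its group, which is the reduction in TG size that the abstract attributes to genotyping errors.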