Preedy K F, Hackett C A
Biomathematics and Statistics Scotland, Invergowrie, Dundee, DD2 5DA, UK.
Theor Appl Genet. 2016 Nov;129(11):2117-2132. doi: 10.1007/s00122-016-2761-8. Epub 2016 Aug 9.
The paper proposes and validates a robust method for rapid construction of high-density linkage maps suitable for autotetraploid species. Modern genotyping techniques are producing increasingly high numbers of genetic markers that can be scored in experimental populations of plants and animals. Ordering these markers to form a reliable linkage map is computationally challenging. There is a wide literature on this topic, but most has focussed on populations derived from diploid, homozygous parents. The challenge of ordering markers in an autotetraploid population has received little attention, and there is currently no method that runs sufficiently rapidly to investigate the effects of omitting problematic markers on map order in larger datasets. Here, we have explored the use of multidimensional scaling (MDS) to order markers from a cross between autotetraploid parents, using simulated data with 74-152 markers on a linkage group and also experimental data from a potato population. We compared different functions of the recombination fraction and LOD score to form the MDS stress function and found that an LOD weighting generally performed well, including when missing values and genotyping errors are present. We conclude that an initial analysis using unconstrained MDS gives a rapid method to detect and remove problematic markers, and that a subsequent analysis using either constrained MDS or principal curve analysis gives reliable marker orders. The latter approach is also particularly rapid, taking less than 10 s on a set of 258 markers compared to 6 days for the JoinMap software. This MDS approach could also be applied to experimental populations of diploid species.
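The ordering idea described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes pairwise recombination fractions are converted to map distances with the Haldane map function (the abstract only says that several functions of the recombination fraction were compared), that the LOD score is used directly as the stress weight, and that projecting a 2D configuration onto its first principal axis stands in for the constrained MDS or principal curve step. The function names (`haldane_distance`, `classical_mds`, `weighted_smacof`, `order_from_embedding`) are hypothetical, and the toy LOD values use a diploid backcross formula purely to produce plausible weights; they are not autotetraploid estimates.

```python
import numpy as np

def haldane_distance(r):
    """Map recombination fractions to additive distances (Morgans). Assumed map function."""
    r = np.clip(r, 0.0, 0.499)
    return -0.5 * np.log(1.0 - 2.0 * r)

def classical_mds(delta, dim=2):
    """Torgerson scaling, used here only as a starting configuration."""
    n = delta.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (delta ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))

def weighted_stress(X, delta, w):
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    return 0.5 * np.sum(w * (d - delta) ** 2), d

def weighted_smacof(delta, weights, dim=2, n_iter=200, tol=1e-10):
    """Minimise sum_ij w_ij (d_ij(X) - delta_ij)^2 by SMACOF majorisation."""
    w = np.array(weights, dtype=float)
    np.fill_diagonal(w, 0.0)
    V = np.diag(w.sum(axis=1)) - w          # weighted Laplacian
    V_pinv = np.linalg.pinv(V)
    X = classical_mds(delta, dim)
    prev = np.inf
    for _ in range(n_iter):
        stress, d = weighted_stress(X, delta, w)
        if prev - stress < tol:
            break
        prev = stress
        ratio = np.divide(delta, d, out=np.zeros_like(d), where=d > 0)
        B = -w * ratio
        np.fill_diagonal(B, 0.0)
        np.fill_diagonal(B, -B.sum(axis=1))
        X = V_pinv @ B @ X                  # Guttman transform
    stress, _ = weighted_stress(X, delta, w)
    return X, stress

def order_from_embedding(X):
    """Order markers along the first principal axis (simplified stand-in)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return np.argsort(Xc @ Vt[0])

# Toy, noise-free example: eight markers at known positions (Morgans), shuffled.
positions = np.array([0.30, 0.05, 0.52, 0.18, 0.41, 0.00, 0.25, 0.60])
gap = np.abs(positions[:, None] - positions[None, :])
r = 0.5 * (1.0 - np.exp(-2.0 * gap))        # inverse Haldane

# Hypothetical LOD weights from a diploid backcross formula with 200 individuals.
r_safe = np.clip(r, 1e-12, 0.5)
lod = 200 * (np.log10(2.0) + r_safe * np.log10(r_safe)
             + (1.0 - r_safe) * np.log10(1.0 - r_safe))

X, final_stress = weighted_smacof(haldane_distance(r), lod)
order = order_from_embedding(X)
true_order = np.argsort(positions)
print(order, final_stress)
```

Because LOD scores shrink towards zero as the recombination fraction approaches 0.5, weighting by LOD lets tightly linked, well-estimated pairs dominate the stress while poorly informative pairs contribute little. The recovered order may be the reverse of the true one, since map orientation is arbitrary.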