Thérèse Navarro Alejandro, Bourke Peter M, van de Weg Eric, Clot Corentin R, Arens Paul, Finkers Richard, Maliepaard Chris
Plant Breeding, Wageningen University & Research, Wageningen, Netherlands.
Front Genet. 2023 Mar 1;14:1049988. doi: 10.3389/fgene.2023.1049988. eCollection 2023.
Linkage mapping is an approach to order markers based on recombination events. Mapping algorithms cannot easily handle genotyping errors, which are common in high-throughput genotyping data. To solve this issue, strategies have been developed, aimed mostly at identifying and eliminating these errors. One such strategy is SMOOTH, an iterative algorithm to detect genotyping errors. Unlike other approaches, SMOOTH can also be used to impute the most probable alternative genotypes, but its application is limited to diploid species and to markers heterozygous in only one of the parents. In this study we adapted SMOOTH to expand its use to any marker type and to autopolyploids with the use of identity-by-descent probabilities, naming the updated algorithm Smooth Descent (SD). We applied SD to real and simulated data, showing that in the presence of genotyping errors this method produces better genetic maps in terms of marker order and map length. SD is particularly useful for error rates between 5% and 20% and when error rates are not homogeneous among markers or individuals. With a starting error rate of 10%, SD reduced it to ∼5% in diploids, ∼7% in tetraploids and ∼8.5% in hexaploids. In turn, the correlation between true and estimated genetic maps increased by 0.03 in tetraploids and by 0.2 in hexaploids, while decreasing slightly in diploids (by ∼0.0011). We also show that the combination of genotype curation and map re-estimation allowed us to obtain better genetic maps while correcting wrong genotypes. We have implemented this algorithm in the R package Smooth Descent.
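The two evaluation quantities reported above, the genotyping error rate and the correlation between true and estimated genetic maps, can be illustrated with a minimal Python sketch for simulated data where the true dosages and marker positions are known. This is not the API of the Smooth Descent R package: the function names `genotype_error_rate` and `map_correlation` are hypothetical, and the use of Pearson correlation on marker positions is an assumption, since the abstract does not state which correlation measure was used.

```python
from math import sqrt


def genotype_error_rate(true_genotypes, observed_genotypes):
    """Fraction of genotype calls that differ from the true dosage.

    Both arguments are lists of rows (one row per marker), each row holding
    one integer dosage per individual. Hypothetical helper for illustration.
    """
    pairs = [
        (t, o)
        for true_row, obs_row in zip(true_genotypes, observed_genotypes)
        for t, o in zip(true_row, obs_row)
    ]
    if not pairs:
        raise ValueError("no genotype calls to compare")
    return sum(t != o for t, o in pairs) / len(pairs)


def map_correlation(true_positions, estimated_positions):
    """Pearson correlation between true and estimated marker positions (cM).

    Assumption: the abstract does not specify the correlation measure;
    Pearson is used here only as an example.
    """
    n = len(true_positions)
    if n != len(estimated_positions) or n < 2:
        raise ValueError("need two equally long lists with at least 2 markers")
    mean_t = sum(true_positions) / n
    mean_e = sum(estimated_positions) / n
    cov = sum((t - mean_t) * (e - mean_e)
              for t, e in zip(true_positions, estimated_positions))
    var_t = sum((t - mean_t) ** 2 for t in true_positions)
    var_e = sum((e - mean_e) ** 2 for e in estimated_positions)
    if var_t == 0 or var_e == 0:
        raise ValueError("positions must not all be identical")
    return cov / sqrt(var_t * var_e)


if __name__ == "__main__":
    # Toy data: 2 markers x 4 individuals, with 2 of 8 calls in error.
    true_calls = [[0, 1, 2, 1], [1, 1, 0, 2]]
    observed_calls = [[0, 1, 1, 1], [1, 2, 0, 2]]
    print(genotype_error_rate(true_calls, observed_calls))
    print(map_correlation([0, 10, 20, 30], [0, 12, 19, 33]))
```

Computing both quantities before and after genotype correction would give the kind of comparison summarized in the abstract, under the assumptions stated above.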