Université Paris-Saclay, INRAE, AgroParisTech, GABI, Jouy-en-Josas, France.
SYSAAF (French Poultry and Aquaculture Breeders Technical Centre), Rennes, France.
PLoS Comput Biol. 2024 Sep 24;20(9):e1012483. doi: 10.1371/journal.pcbi.1012483. eCollection 2024 Sep.
Triploidy is valuable in both aquaculture and some cultivated plants: the induced sterility helps enhance growth and product quality, and it acts as a barrier against the contamination of wild populations by escapees. To use genetic information from triploids for academic or breeding purposes, an efficient and robust method to genotype triploids is needed. We developed such a method for genotype calling from SNP arrays and implemented it in the R package GenoTriplo. Our method requires no prior information on cluster positions and remains unaffected by shifted luminescence signals. It starts the clustering algorithm with more groups than expected from the ploidy level of the samples, then merges groups that are too close to each other to be considered distinct genotypes. Accurate classification of SNPs is achieved through multiple quality-control thresholds. We compared the performance of GenoTriplo with that of fitPoly, the only published method for triploid SNP genotyping with freely available software, by comparing the genotypes generated by both methods for a dataset of 1232 triploid rainbow trout genotyped for 38,033 SNPs. The two methods were consistent for 89% of the genotypes, but for 26% of the SNPs they differed in the number of distinct genotypes identified. For these SNPs, GenoTriplo had >95% concordance with fitPoly when fitPoly genotyped better. Conversely, when GenoTriplo genotyped better, fitPoly had less than 50% concordance with GenoTriplo. GenoTriplo was more robust, with fewer genotyping errors. It is also efficient at identifying low-frequency genotypes in the sample set. Finally, we assessed parentage assignment based on GenoTriplo genotyping and observed significant differences in mismatch rates between the best and second-best couples, indicating high confidence in the results. GenoTriplo could also be used to genotype diploids, as well as individuals with a higher ploidy level, by adjusting a few input parameters.
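The over-cluster-then-merge idea described in the abstract can be illustrated with a minimal sketch. This is not GenoTriplo's code or API (the package is written in R). It assumes a one-dimensional signal per sample (for example, an allele contrast value), a plain k-means step, and a fixed distance threshold for merging. The function names `kmeans_1d` and `merge_close`, the parameter `min_gap`, and the toy data are all hypothetical.

```python
from statistics import mean


def kmeans_1d(values, k, n_iter=50):
    """Plain Lloyd's k-means on 1-D data, started from evenly spaced centers.

    Assumption: a generic stand-in for the clustering step; the abstract
    does not say which clustering algorithm GenoTriplo uses.
    """
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    groups = []
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Drop empty groups, so the number of clusters can only shrink.
        groups = [g for g in groups if g]
        new_centers = [mean(g) for g in groups]
        if new_centers == centers:
            break
        centers = new_centers
    return groups


def merge_close(groups, min_gap):
    """Merge neighbouring groups whose means are closer than min_gap.

    Assumption: min_gap is a hypothetical threshold chosen for this toy data.
    """
    ordered = sorted(groups, key=mean)
    merged = [list(ordered[0])]
    for g in ordered[1:]:
        if mean(g) - mean(merged[-1]) < min_gap:
            merged[-1].extend(g)  # too close to be a distinct genotype
        else:
            merged.append(list(g))
    return merged


# Toy triploid SNP: four genotype clusters (AAA, AAB, ABB, BBB), five samples each.
offsets = [-0.02, -0.01, 0.0, 0.01, 0.02]
signal = [c + o for c in (-0.9, -0.3, 0.3, 0.9) for o in offsets]

initial = kmeans_1d(signal, k=6)             # deliberately more groups than 4
genotypes = merge_close(initial, min_gap=0.2)
print(len(initial), "initial groups merged into", len(genotypes))
```

Starting with six groups splits two of the true clusters, and the merge step then rejoins the fragments because their means lie well inside `min_gap`. A real caller would also need the quality-control thresholds mentioned in the abstract, which this sketch omits.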