Key Laboratory of Mariculture, Ministry of Education, College of Fisheries, Ocean University of China, Qingdao 266003, China.
Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China.
Genes (Basel). 2022 Nov 29;13(12):2247. doi: 10.3390/genes13122247.
The extensive use of genomic selection (GS) in livestock and crops has produced a series of genomic-prediction (GP) algorithms, yet no single algorithm suits all species and traits. A systematic evaluation of the available GP algorithms is therefore needed to identify the optimal algorithm for selective breeding in aquaculture species. In this study, ten GP algorithms, including both traditional and machine-learning algorithms, were systematically compared using publicly available genotype and phenotype data for eight traits, including weight and disease-resistance traits, from five aquaculture species. The aim was to provide insight into the optimal algorithm for GP in aquatic animals. Notably, no algorithm performed best for all traits. However, the reproducing kernel Hilbert space (RKHS) and support-vector machine (SVM) algorithms achieved relatively high prediction accuracies for most of the tested traits. Bayes A and random forest (RF) were more robust to noise in the phenotypic data than the other algorithms. The prediction performance of the GP algorithms on the tested data was improved by using a genome-wide association study (GWAS) to select subsets of significant SNPs. An R package, "ASGS," which integrates commonly used traditional and machine-learning algorithms so that the optimal algorithm can be found efficiently, was developed to support the application of genomic selection in aquaculture breeding. This work provides valuable information and a tool for optimizing GP algorithms, aiding genetic breeding in aquaculture species.
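The comparison described in the abstract can be illustrated with a minimal sketch in Python, using only the standard library. It is not the ASGS package and does not implement any of the ten algorithms from the study. The data are simulated, every function name is hypothetical, and the two predictors (a summed single-marker regression score and a Gaussian kernel smoother) are simple stand-ins chosen only to show the workflow: fit each candidate on training folds, predict the held-out fold, and score it by the Pearson correlation between predicted and observed phenotypes. The "top 10 SNPs" variant ranks markers by their correlation with the trait inside each training fold, a crude stand-in for GWAS-based SNP preselection.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between predicted and observed values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def simulate(n=120, m=60, n_causal=8, seed=1):
    """Toy data (assumption): 0/1/2 genotypes, additive trait plus noise."""
    rng = random.Random(seed)
    geno = [[rng.choice((0, 1, 2)) for _ in range(m)] for _ in range(n)]
    effects = [rng.gauss(0, 1) if j < n_causal else 0.0 for j in range(m)]
    pheno = [sum(g * b for g, b in zip(row, effects)) + rng.gauss(0, 1)
             for row in geno]
    return geno, pheno


def marker_score_model(train_x, train_y, top_k=None):
    """Sum of single-marker regression effects; optionally keep only the
    top_k markers ranked by |correlation| with the trait."""
    mean_y = sum(train_y) / len(train_y)
    stats = []
    for j in range(len(train_x[0])):
        col = [row[j] for row in train_x]
        mean_x = sum(col) / len(col)
        sxx = sum((a - mean_x) ** 2 for a in col)
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(col, train_y))
        slope = sxy / sxx if sxx > 0 else 0.0
        stats.append((abs(pearson(col, train_y)), j, slope, mean_x))
    stats.sort(reverse=True)
    kept = stats[:top_k] if top_k else stats

    def predict(row):
        return mean_y + sum(s * (row[j] - mx) for _, j, s, mx in kept)

    return predict


def kernel_smoother_model(train_x, train_y):
    """Gaussian kernel smoother on genotype distance (not RKHS regression)."""
    bandwidth = float(len(train_x[0]))  # assumption: scale with marker count

    def predict(row):
        weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(row, t)) / bandwidth)
                   for t in train_x]
        total = sum(weights)
        if total == 0.0:
            return sum(train_y) / len(train_y)
        return sum(w * y for w, y in zip(weights, train_y)) / total

    return predict


def cross_validate(fit, geno, pheno, k=5, seed=0):
    """Mean Pearson correlation over k folds; models are fit on training data only."""
    idx = list(range(len(pheno)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        predict = fit([geno[i] for i in train], [pheno[i] for i in train])
        predicted = [predict(geno[i]) for i in test]
        accuracies.append(pearson(predicted, [pheno[i] for i in test]))
    return sum(accuracies) / k


geno, pheno = simulate()
models = {
    "marker score (all SNPs)": marker_score_model,
    "marker score (top 10 SNPs)": lambda x, y: marker_score_model(x, y, top_k=10),
    "kernel smoother": kernel_smoother_model,
}
results = {name: cross_validate(fit, geno, pheno) for name, fit in models.items()}
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean accuracy = {acc:.3f}")
```

Marker ranking is done inside each training fold rather than on the full data, so the held-out phenotypes never influence which SNPs are kept. Rankings from this toy simulation say nothing about which algorithm is preferable for real aquaculture traits.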