Reinoso-Peláez Edgar L, Gianola Daniel, González-Recio Oscar
Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria. Ctra. de La Coruña, Madrid, Spain.
Department of Animal and Dairy Sciences, University of Wisconsin-Madison, Madison, WI, USA.
Methods Mol Biol. 2022;2467:189-218. doi: 10.1007/978-1-0716-2205-6_7.
Artificial intelligence and machine learning (ML) methodology has grown explosively in recent years. In this class of procedures, computers learn from sets of experiences and produce forecasts or classifications. Many ML studies have been carried out in genome-wide based prediction (GWP). This chapter describes the main semiparametric and nonparametric algorithms used in GWP in animals and plants. Thirty-four comparative ML studies conducted in the last decade were combined in a meta-analysis using a Thurstonian model to identify the algorithms with the best predictive qualities. Some kernel, Bayesian, and ensemble methods displayed greater robustness and predictive ability. However, the type of study and the data distribution must be considered when choosing the most appropriate model for a given problem.