Heuristic hyperparameter optimization of deep learning models for genomic prediction.

Affiliations

Department of Animal Science, Michigan State University, East Lansing, MI 48824, USA.

Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA.

Publication information

G3 (Bethesda). 2021 Jul 14;11(7). doi: 10.1093/g3journal/jkab032.

Abstract

There is growing interest among quantitative geneticists and animal breeders in the use of deep learning (DL) for genomic prediction. However, the performance of DL is affected by hyperparameters that are typically set manually by users. These hyperparameters do not simply specify the architecture of the model; they are also critical for the efficacy of the optimization and model-fitting process. To date, most DL approaches used for genomic prediction have concentrated on identifying suitable hyperparameters by exploring discrete options from a subset of the hyperparameter space. Enlarging the hyperparameter optimization search space with continuous hyperparameters is a daunting combinatorial problem. To deal with this problem, we propose using differential evolution (DE) to perform an efficient search of arbitrarily complex hyperparameter spaces in DL models, and we apply this to the specific case of genomic prediction of livestock phenotypes. This approach was evaluated on two pig and cattle datasets with real genotypes and simulated phenotypes (N = 7,539 animals and M = 48,541 markers) and one real dataset (N = 910 individuals and M = 28,916 markers). Hyperparameters were evaluated using cross-validation. We compared the predictive performance of DL models using hyperparameters optimized by DE against DL models with "best practice" hyperparameters selected from published studies and baseline DL models with randomly specified hyperparameters. Models optimized with DE showed a clear improvement in predictive performance across all three datasets. DE-optimized hyperparameters also resulted in DL models with less overfitting and less variation in predictive performance over repeated retraining compared to non-optimized DL models.
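To make the idea of a DE search over continuous hyperparameters concrete, the following is a minimal sketch and not the authors' implementation. It uses a DE/rand/1/bin-style loop written with the Python standard library only. The function `toy_cv_loss`, the two hyperparameters (log10 learning rate and dropout rate), their bounds, and the settings `pop_size`, `f_weight` and `cr` are all illustrative assumptions; in a real study the loss would be the cross-validated prediction error of a trained DL model.

```python
import random


def differential_evolution(loss, bounds, pop_size=20, generations=60,
                           f_weight=0.8, cr=0.9, seed=0):
    """Minimise `loss` inside the box `bounds` with a DE/rand/1/bin-style loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialise the population uniformly inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [loss(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if j == j_rand or rng.random() < cr:
                    v = pop[a][j] + f_weight * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)  # clip back into the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_score = loss(trial)
            # Greedy selection: the trial replaces the target if it is no worse.
            # Replacement happens in place, so later targets can already use it.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_cv_loss(params):
    """Hypothetical stand-in for a cross-validated prediction error."""
    log10_lr, dropout = params
    return (log10_lr + 3.0) ** 2 + (dropout - 0.2) ** 2


# Assumed search box: log10(learning rate) in [-5, -1], dropout in [0, 0.9].
BOUNDS = [(-5.0, -1.0), (0.0, 0.9)]
best_params, best_loss = differential_evolution(toy_cv_loss, BOUNDS)
```

Because the toy loss has its minimum at a log10 learning rate of -3 and a dropout rate of 0.2, the search should return values close to those. Replacing `toy_cv_loss` with a function that trains a model and returns its cross-validation error would turn the sketch into an actual hyperparameter search, at a much higher cost per evaluation.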

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d76e/8495939/549dd2dfa2cd/jkab032f1.jpg
