Sparse Convolutional Neural Networks for Genome-Wide Prediction.

Authors

Waldmann Patrik, Pfeiffer Christina, Mészáros Gábor

Affiliations

Department of Animal Breeding and Genetics, The Swedish University of Agricultural Sciences, Uppsala, Sweden.

Division of Livestock Science, University of Natural Resources and Life Sciences Vienna (BOKU), Vienna, Austria.

Publication information

Front Genet. 2020 Feb 6;11:25. doi: 10.3389/fgene.2020.00025. eCollection 2020.

DOI:10.3389/fgene.2020.00025

PMID:32117441

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7029737/

Abstract

Genome-wide prediction (GWP) has become the state-of-the-art method in artificial selection. Data sets often comprise numbers of genomic markers and individuals ranging from a few thousand to millions. Hence, computational efficiency is important, and various machine learning methods have successfully been used in GWP. Neural networks (NN) and deep learning (DL) are very flexible methods that usually show outstanding prediction properties on complex structured data, but their use in GWP is nevertheless rare and debated. This study describes a powerful NN method for genomic marker data that can easily be extended. It is shown that a one-dimensional convolutional neural network (CNN) can be used to incorporate the ordinal information between markers and, together with pooling and ℓ1-norm regularization, provides a sparse and computationally efficient approach for GWP. The method, denoted CNNGWP, is implemented in the deep learning software Keras, and hyper-parameters of the NN are tuned with Bayesian optimization. Model-averaged ensemble predictions further reduce prediction error. Evaluations show that CNNGWP improves prediction error by more than 25% on simulated data and around 3% on real pig data compared with results obtained with GBLUP and the LASSO. In conclusion, the CNNGWP provides a promising approach for GWP, but the magnitude of improvement depends on the genetic architecture and the heritability.
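
The building blocks named in the abstract (a one-dimensional convolution over ordered markers, pooling, an ℓ1-norm penalty, and model-averaged predictions) can be illustrated with a minimal pure-Python sketch. This is not the authors' Keras implementation: every function name, the filter weights, the 0/1/2 marker coding of the toy data, and the penalty strength are assumptions chosen for illustration, and no training or Bayesian optimization of hyper-parameters is shown.

```python
# Illustrative sketch only: a forward pass in plain Python, not the CNNGWP code.
# All names, weights and the penalty strength below are hypothetical.

def conv1d(x, kernel, stride=1):
    """Slide one filter along the ordered markers and apply a ReLU."""
    k = len(kernel)
    out = []
    for start in range(0, len(x) - k + 1, stride):
        s = sum(w * v for w, v in zip(kernel, x[start:start + k]))
        out.append(max(0.0, s))
    return out

def max_pool1d(x, size):
    """Keep the largest activation in each non-overlapping window."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]

def predict(genotype, kernel, dense_w, bias, pool=2):
    """Convolution -> max pooling -> linear output for one individual."""
    pooled = max_pool1d(conv1d(genotype, kernel), pool)
    return bias + sum(w * p for w, p in zip(dense_w, pooled))

def l1_penalty(weights, lam):
    """Sum of absolute weights scaled by lam; large lam favors zero weights."""
    return lam * sum(abs(w) for w in weights)

def penalized_loss(y, y_hat, weights, lam):
    """Mean squared error plus the l1 penalty on the given weights."""
    mse = sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)
    return mse + l1_penalty(weights, lam)

def ensemble_mean(predictions):
    """Average the predictions of several models, individual by individual."""
    return [sum(p) / len(p) for p in zip(*predictions)]

# Toy input: 8 ordered markers, assumed coded as 0/1/2 allele counts.
genotype = [0, 1, 2, 2, 0, 0, 1, 2]
kernel = [0.5, -0.25, 0.0]      # hypothetical filter weights
dense_w = [0.5, 2.0, 1.0]       # hypothetical output-layer weights
y_hat = predict(genotype, kernel, dense_w, bias=0.25)
loss = penalized_loss([2.0], [y_hat], kernel, lam=0.5)
```

Because the filter only sees neighbouring markers, the convolution uses the marker order, and pooling shrinks the number of inputs passed to the output layer. The ℓ1 term is the same kind of penalty used by the LASSO, which is why it tends to produce sparse weights when the model is fitted.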


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc91/7029737/45d98c1f2d4f/fgene-11-00025-g001.jpg
