Generative AI for predictive breeding: hopes and caveats.

Authors

Pérez-Enciso M, Zingaretti L M, de Los Campos G

Affiliations

Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Campus UAB, 08193, Bellaterra, Barcelona, Spain.

Institució Catalana de Recerca I Estudis Avançats (ICREA), 08010, Barcelona, Spain.

Publication information

Theor Appl Genet. 2025 Jun 11;138(7):147. doi: 10.1007/s00122-025-04942-8.

Abstract

Within the broad field of artificial intelligence (AI), generative AI algorithms have emerged as a revolutionary technology able to produce highly realistic 'synthetic' data, akin to standard simulation but with fewer constraints. The main focus of generative AI has been on phenotypes, but here we argue that it can also serve to generate synthetic environments and genotypes. This data-driven technology may be able to overcome some of the limitations of standard simulations, such as strong assumptions about the underlying genotype-to-phenotype map. We discuss key features of popular generative models, including autoregressive models, generative adversarial networks, variational autoencoders, diffusion models and flow-based models. Several of these methods use a latent space, often of lower dimensionality than the raw data, which can help make the models interpretable and can serve as a link between simulation and generative algorithms. Augmenting data as realistically as possible with genAI can improve the inference and predictive performance of genomic prediction models, but symbolic simulation will continue to play a fundamental role in predictive breeding. A hybrid tool that implements both approaches can be extremely powerful for evaluating predictive breeding strategies in silico. One promising direction is to simulate novel genotypes using conventional methods, then apply generative models to produce realistic phenotypes conditional on genotype and environment.
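
The hybrid direction described in the last sentence of the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function `simulate_offspring` stands in for "conventional" simulation (gene dropping with unlinked loci and random mating), and the class `ConditionalGaussian` is a deliberately simple stand-in for a conditional generative model (a linear-Gaussian model of phenotype given genotype and environment, not a GAN, VAE, or diffusion model). The "observed" data are random placeholder values.

```python
import numpy as np


def simulate_offspring(parents, n_offspring, rng):
    """Conventional step (hypothetical helper): gene dropping with unlinked loci.

    parents: (n_parents, n_snps) array of allele dosages in {0, 1, 2}.
    Each offspring receives one gamete from each of two randomly drawn parents.
    """
    n_parents = parents.shape[0]
    sires = rng.integers(0, n_parents, n_offspring)
    dams = rng.integers(0, n_parents, n_offspring)
    # A parent with dosage d transmits the counted allele with probability d / 2.
    gametes_sire = rng.binomial(1, parents[sires] / 2.0)
    gametes_dam = rng.binomial(1, parents[dams] / 2.0)
    return gametes_sire + gametes_dam


class ConditionalGaussian:
    """Stand-in generative model (assumption): y | x ~ Normal(b0 + x @ b, sigma^2)."""

    def fit(self, X, y, ridge=1.0):
        Xb = np.column_stack([np.ones(len(X)), X])
        penalty = ridge * np.eye(Xb.shape[1])
        penalty[0, 0] = 0.0  # do not shrink the intercept
        self.beta = np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)
        self.sigma = (y - Xb @ self.beta).std()
        return self

    def sample(self, X, rng):
        # Draw one synthetic phenotype per row, conditional on its covariates.
        Xb = np.column_stack([np.ones(len(X)), X])
        return Xb @ self.beta + rng.normal(0.0, self.sigma, len(X))


rng = np.random.default_rng(0)

# Placeholder "observed" data: 50 parents, 20 SNPs, one environmental covariate.
parents = rng.binomial(2, 0.3, size=(50, 20))
env_obs = rng.normal(size=50)
toy_effects = rng.normal(0.0, 0.3, size=20)
y_obs = parents @ toy_effects + 0.5 * env_obs + rng.normal(0.0, 1.0, size=50)

# Fit the generative stand-in on observed genotype + environment.
model = ConditionalGaussian().fit(np.column_stack([parents, env_obs]), y_obs)

# Simulate novel genotypes conventionally, then generate phenotypes for them.
offspring = simulate_offspring(parents, 100, rng)
env_new = rng.normal(size=100)
y_synth = model.sample(np.column_stack([offspring, env_new]), rng)
```

In a realistic setting the linear-Gaussian stand-in would be replaced by one of the model families the abstract lists, and the gene-dropping step by a simulator that accounts for linkage and recombination; the division of labor between the two steps is the point of the sketch.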

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e90c/12159116/ac41d45e1a9b/122_2025_4942_Fig1_HTML.jpg
