Automatic inference of demographic parameters using generative adversarial networks.

Affiliations

Department of Computer Science, Haverford College, Haverford, PA, USA.

Department of Computer Science, Swarthmore College, Swarthmore, PA, USA.

Publication information

Mol Ecol Resour. 2021 Nov;21(8):2689-2705. doi: 10.1111/1755-0998.13386. Epub 2021 May 3.

Abstract

Population genetics relies heavily on simulated data for validation, inference and intuition. In particular, since the evolutionary 'ground truth' for real data is always limited, simulated data are crucial for training supervised machine learning methods. Simulation software can accurately model evolutionary processes but requires many hand-selected input parameters. As a result, simulated data often fail to mirror the properties of real genetic data, which limits the scope of methods that rely on them. Here, we develop a novel approach to estimating parameters in population genetic models that automatically adapts to data from any population. Our method, pg-gan, is based on a generative adversarial network that gradually learns to generate realistic synthetic data. We demonstrate that our method is able to recover input parameters in a simulated isolation-with-migration model. We then apply our method to human data from the 1000 Genomes Project and show that we can accurately recapitulate the features of real data.
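
The adversarial idea in the abstract can be illustrated with a toy sketch. This is not the pg-gan implementation: the one-parameter Poisson "simulator", the logistic-regression discriminator, the proposal-based parameter update, and every name below (`simulate`, `fit_discriminator`, `generator_loss`, `infer_theta`) are assumptions made purely for illustration. The sketch only shows the general loop: a discriminator learns to tell real from simulated data, and the simulator parameter is nudged toward values whose output the discriminator finds harder to reject.

```python
import numpy as np

def simulate(theta, n, rng):
    # Toy stand-in for a population genetic simulator (an assumption): one
    # summary value per region, drawn from a Poisson with mean theta.
    return rng.poisson(theta, size=n).astype(float)

def fit_discriminator(real, fake, steps=200, lr=0.5):
    # Logistic regression on one standardized feature; real=1, simulated=0.
    x = np.concatenate([real, fake])
    y = np.concatenate([np.ones(len(real)), np.zeros(len(fake))])
    mu, sd = x.mean(), x.std() + 1e-9
    z = (x - mu) / sd
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * z + b)))
        w -= lr * np.mean((p - y) * z)
        b -= lr * np.mean(p - y)
    # Return a function mapping raw values to discriminator logits.
    return lambda v: w * (v - mu) / sd + b

def generator_loss(logit_fn, fake):
    # Mean of -log D(x) over simulated data, computed as softplus(-logit).
    return float(np.mean(np.logaddexp(0.0, -logit_fn(fake))))

def infer_theta(real, theta0=5.0, iters=60, n_proposals=5, seed=0):
    rng = np.random.default_rng(seed)
    theta, n = theta0, len(real)
    for i in range(iters):
        sigma = 3.0 * (1.0 - i / iters) + 0.05  # shrinking proposal width
        logit_fn = fit_discriminator(real, simulate(theta, n, rng))
        # Candidates: the current value plus a few random perturbations.
        candidates = [theta] + [
            max(0.1, theta + sigma * rng.normal()) for _ in range(n_proposals)
        ]
        losses = [generator_loss(logit_fn, simulate(c, n, rng)) for c in candidates]
        # Keep whichever candidate fools the current discriminator best.
        theta = candidates[int(np.argmin(losses))]
    return theta

rng = np.random.default_rng(1)
real = simulate(20.0, 500, rng)  # "real" data with a known true theta of 20
theta_hat = infer_theta(real)
print(f"estimated theta: {theta_hat:.2f} (true value 20.0)")
```

In this toy setting the estimate should drift from the starting value of 5 toward the true value of 20, because the discriminator is refit at every iteration and stops separating the two samples once the simulated and real distributions match.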

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4020/8596911/50511b296104/MEN-21-2689-g002.jpg
