Deep Learning in Population Genetics.

Affiliations

Professorship for Population Genetics, Department of Life Science Systems, Technical University of Munich, Germany.

Centre for Biological Diversity, Sir Harold Mitchell Building, University of St Andrews, Fife KY16 9TF, UK.

Publication information

Genome Biol Evol. 2023 Feb 3;15(2). doi: 10.1093/gbe/evad008.

DOI: 10.1093/gbe/evad008
PMID: 36683406
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9897193/
Abstract

Population genetics is transitioning into a data-driven discipline thanks to the availability of large-scale genomic data and the need to study increasingly complex evolutionary scenarios. With likelihood and Bayesian approaches becoming either intractable or computationally unfeasible, machine learning, and in particular deep learning, algorithms are emerging as popular techniques for population genetic inferences. These approaches rely on algorithms that learn non-linear relationships between the input data and the model parameters being estimated through representation learning from training data sets. Deep learning algorithms currently employed in the field comprise discriminative and generative models with fully connected, convolutional, or recurrent layers. Additionally, a wide range of powerful simulators to generate training data under complex scenarios are now available. The application of deep learning to empirical data sets mostly replicates previous findings of demography reconstruction and signals of natural selection in model organisms. To showcase the feasibility of deep learning to tackle new challenges, we designed a branched architecture to detect signals of recent balancing selection from temporal haplotypic data, which exhibited good predictive performance on simulated data. Investigations on the interpretability of neural networks, their robustness to uncertain training data, and creative representation of population genetic data, will provide further opportunities for technological advancements in the field.
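The abstract mentions a branched architecture that takes temporal haplotypic data as input but gives no details of its layers. The following is only an illustrative sketch of the general idea, not the authors' network: each sampling time point is assumed to be a 0/1 haplotype matrix, each matrix passes through its own convolutional branch, and the pooled features are concatenated before a single sigmoid output. All function names, sizes, and the random (untrained) weights are hypothetical.

```python
import math
import random


def branch_features(haplotypes, kernels):
    """One branch: 1D convolution along sites, ReLU, global average pooling.

    haplotypes: list of rows (one per sampled haplotype) of 0/1 alleles.
    kernels: list of 1D filters; each filter yields one pooled feature.
    """
    features = []
    for kernel in kernels:
        width = len(kernel)
        total, count = 0.0, 0
        for row in haplotypes:
            for start in range(len(row) - width + 1):
                activation = sum(row[start + k] * kernel[k] for k in range(width))
                total += max(0.0, activation)  # ReLU
                count += 1
        # Averaging over rows and positions makes the feature
        # independent of the order in which haplotypes are listed.
        features.append(total / count)
    return features


def branched_forward(early, late, params):
    """Send each time point through its own branch, concatenate, classify."""
    merged = (branch_features(early, params["early_kernels"])
              + branch_features(late, params["late_kernels"]))
    logit = params["bias"] + sum(w * x for w, x in zip(params["dense"], merged))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid score in (0, 1)


def random_params(n_filters=4, width=3, seed=0):
    """Hypothetical untrained weights; a real model would learn these."""
    rng = random.Random(seed)

    def kernels():
        return [[rng.uniform(-1, 1) for _ in range(width)]
                for _ in range(n_filters)]

    return {
        "early_kernels": kernels(),
        "late_kernels": kernels(),
        "dense": [rng.uniform(-1, 1) for _ in range(2 * n_filters)],
        "bias": 0.0,
    }


def random_haplotypes(n_haplotypes, n_sites, rng):
    """Placeholder input; real training data would come from a simulator."""
    return [[rng.randint(0, 1) for _ in range(n_sites)]
            for _ in range(n_haplotypes)]


if __name__ == "__main__":
    rng = random.Random(1)
    params = random_params()
    early = random_haplotypes(10, 20, rng)  # sample from the older time point
    late = random_haplotypes(10, 20, rng)   # sample from the recent time point
    print(branched_forward(early, late, params))
```

Giving each time point its own filters lets the branches learn different patterns before the merge. In practice the weights would be fitted to simulated data labelled with and without balancing selection, typically with a deep learning library rather than plain Python.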


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da82/9897193/b7172fbe23be/evad008f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da82/9897193/096b32363c79/evad008f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da82/9897193/eb6a44916842/evad008f2.jpg
