Neural networks for self-adjusting mutation rate estimation when the recombination rate is unknown.

Affiliations

Cluster of Excellence "Machine Learning: New Perspectives for Science", University of Tübingen, Tübingen, Germany.

Department of Mathematical Stochastics, University of Freiburg, Freiburg, Germany.

Publication information

PLoS Comput Biol. 2022 Aug 3;18(8):e1010407. doi: 10.1371/journal.pcbi.1010407. eCollection 2022 Aug.

DOI:10.1371/journal.pcbi.1010407

PMID:35921376

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9377634/

Abstract

Estimating the mutation rate, or equivalently effective population size, is a common task in population genetics. If recombination is low or high, optimal linear estimation methods are known and well understood. For intermediate recombination rates, the calculation of optimal estimators is more challenging. As an alternative to model-based estimation, neural networks and other machine learning tools could help to develop good estimators in these involved scenarios. However, if no benchmark is available it is difficult to assess how well suited these tools are for different applications in population genetics. Here we investigate feedforward neural networks for the estimation of the mutation rate based on the site frequency spectrum and compare their performance with model-based estimators. For this we use the model-based estimators introduced by Fu, Futschik et al., and Watterson that minimize the variance or mean squared error for no and free recombination. We find that neural networks reproduce these estimators if provided with the appropriate features and training sets. Remarkably, using the model-based estimators to adjust the weights of the training data, only one hidden layer is necessary to obtain a single estimator that performs almost as well as model-based estimators for low and high recombination rates, and at the same time provides a superior estimation method for intermediate recombination rates. We apply the method to simulated data based on the human chromosome 2 recombination map, highlighting its robustness in a realistic setting where local recombination rates vary and/or are unknown.
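As a point of reference for the model-based estimators named in the abstract, the following is a minimal sketch of Watterson's estimator computed from a site frequency spectrum. It assumes an unfolded spectrum stored as an array whose entry i-1 counts the segregating sites at which the derived allele appears i times in a sample of size n. The function names (`harmonic_number`, `watterson_theta`, `linear_theta`) are illustrative and do not come from the paper. The general linear form is included only to show that Watterson's estimator is the special case with equal weights 1/a_n; the optimal weights of Fu and of Futschik et al. are not reproduced here.

```python
import numpy as np


def harmonic_number(n: int) -> float:
    """Return a_n = sum_{i=1}^{n-1} 1/i for a sample of size n."""
    return float(np.sum(1.0 / np.arange(1, n)))


def watterson_theta(sfs) -> float:
    """Watterson's estimator S / a_n.

    Assumption: `sfs` is an unfolded site frequency spectrum with
    n - 1 entries, so the sample size is len(sfs) + 1 and the number
    of segregating sites S is the sum of all entries.
    """
    sfs = np.asarray(sfs, dtype=float)
    n = sfs.size + 1
    return float(sfs.sum() / harmonic_number(n))


def linear_theta(sfs, weights) -> float:
    """Generic linear estimator: a weighted sum of the SFS entries."""
    sfs = np.asarray(sfs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.dot(weights, sfs))


# Hypothetical spectrum for a sample of size n = 5.
example_sfs = [6, 3, 2, 1]
theta_w = watterson_theta(example_sfs)

# Equal weights 1/a_n recover Watterson's estimator.
equal_weights = np.full(len(example_sfs), 1.0 / harmonic_number(5))
theta_lin = linear_theta(example_sfs, equal_weights)
```

For the hypothetical spectrum above, S = 12 and a_5 = 25/12, so both calls give 144/25 = 5.76.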
