Neural networks for self-adjusting mutation rate estimation when the recombination rate is unknown.

Affiliations

Cluster of Excellence "Machine Learning: New Perspectives for Science", University of Tübingen, Tübingen, Germany.

Department of Mathematical Stochastics, University of Freiburg, Freiburg, Germany.

Publication information

PLoS Comput Biol. 2022 Aug 3;18(8):e1010407. doi: 10.1371/journal.pcbi.1010407. eCollection 2022 Aug.

Abstract

Estimating the mutation rate, or equivalently effective population size, is a common task in population genetics. If recombination is low or high, optimal linear estimation methods are known and well understood. For intermediate recombination rates, the calculation of optimal estimators is more challenging. As an alternative to model-based estimation, neural networks and other machine learning tools could help to develop good estimators in these involved scenarios. However, if no benchmark is available it is difficult to assess how well suited these tools are for different applications in population genetics. Here we investigate feedforward neural networks for the estimation of the mutation rate based on the site frequency spectrum and compare their performance with model-based estimators. For this we use the model-based estimators introduced by Fu, Futschik et al., and Watterson that minimize the variance or mean squared error for no and free recombination. We find that neural networks reproduce these estimators if provided with the appropriate features and training sets. Remarkably, using the model-based estimators to adjust the weights of the training data, only one hidden layer is necessary to obtain a single estimator that performs almost as well as model-based estimators for low and high recombination rates, and at the same time provides a superior estimation method for intermediate recombination rates. We apply the method to simulated data based on the human chromosome 2 recombination map, highlighting its robustness in a realistic setting where local recombination rates vary and/or are unknown.
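For orientation, the simplest of the model-based estimators named above, Watterson's, is a linear function of the site frequency spectrum (SFS). The sketch below is a minimal illustration and not code from the paper. It assumes an unfolded SFS for n sampled sequences under the standard neutral model, where the expected number of sites with derived-allele count i is θ/i. The function names `watterson_theta` and `linear_sfs_estimator` are hypothetical.

```python
def harmonic_number(n):
    """Return a_n = sum_{i=1}^{n-1} 1/i, the normalizing constant for n sequences."""
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(sfs):
    """Watterson's estimator: number of segregating sites divided by a_n.

    Assumption: sfs[i-1] is the number of segregating sites at which the
    derived allele occurs in exactly i of the n sampled sequences, for
    i = 1..n-1 (unfolded spectrum), so len(sfs) == n - 1.
    """
    n = len(sfs) + 1
    segregating_sites = sum(sfs)
    return segregating_sites / harmonic_number(n)


def linear_sfs_estimator(sfs, weights):
    """Generic linear estimator: the weighted sum of the SFS entries.

    Since E[sfs[i-1]] = theta / i under the assumed model, the estimator is
    unbiased when sum(weights[i-1] / i) equals 1.
    """
    return sum(w * x for w, x in zip(weights, sfs))


# Toy spectrum for n = 5 sequences (illustrative numbers, not real data).
sfs = [6, 3, 2, 1]
n = len(sfs) + 1
watterson_weights = [1.0 / harmonic_number(n)] * len(sfs)

theta_w = watterson_theta(sfs)
theta_lin = linear_sfs_estimator(sfs, watterson_weights)
print(theta_w, theta_lin)
```

Watterson's estimator weights every frequency class equally. Other linear estimators differ only in the choice of `weights`, which is one way to read the abstract's comparison: a network trained on the SFS is being asked to find a weighting, possibly nonlinear, that also works when the recombination rate is unknown.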

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00dd/9377634/30ca64782357/pcbi.1010407.g001.jpg
