On the Unfounded Enthusiasm for Soft Selective Sweeps III: The Supervised Machine Learning Algorithm That Isn't.

Affiliations

Department of Biology, Lund University, Sölvegatan 35, 22362 Lund, Sweden.

Department of Biology & Biochemistry, University of Houston, Science & Research Building 2, Suite #342, 3455 Cullen Blvd., Houston, TX 77204-5001, USA.

Publication information

Genes (Basel). 2021 Apr 5;12(4):527. doi: 10.3390/genes12040527.

Abstract

In the last 15 years or so, soft selective sweep mechanisms have been catapulted from a curiosity of little evolutionary importance to a ubiquitous mechanism claimed to explain most adaptive evolution and, in some cases, most evolution. This transformation was aided by a series of articles by Daniel Schrider and Andrew Kern. Within this series, a paper entitled "Soft sweeps are the dominant mode of adaptation in the human genome" (Schrider and Kern, . , (8), 1863-1877) attracted a great deal of attention, in particular in conjunction with another paper (Kern and Hahn, . , (6), 1366-1371), for purporting to discredit the Neutral Theory of Molecular Evolution (Kimura 1968). Here, we address an alleged novelty in Schrider and Kern's paper, i.e., the claim that their study involved an artificial intelligence technique called supervised machine learning (SML). SML is predicated upon the existence of a training dataset in which the correspondence between the input and output is known empirically to be true. Curiously, Schrider and Kern did not possess a training dataset of genomic segments known to have evolved either neutrally or through soft or hard selective sweeps. Thus, their claim of using SML is thoroughly and utterly misleading. In the absence of legitimate training datasets, Schrider and Kern used: (1) simulations that employ many manipulatable variables and (2) a system of data cherry-picking rivaling the worst excesses in the literature. These two factors, in addition to the lack of negative controls and the irreproducibility of their results due to incomplete methodological detail, lead us to conclude that all evolutionary inferences derived from so-called SML algorithms (e.g., S/HIC) should be taken with a huge shovel of salt.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6c27/8066263/702c728959f7/genes-12-00527-g001.jpg
