Department of Molecular & Cellular Biology, Harvard University, Cambridge, Massachusetts, United States of America.
Howard Hughes Medical Institute, Harvard University, Cambridge, Massachusetts, United States of America.
PLoS Biol. 2020 Nov 2;18(11):e3000862. doi: 10.1371/journal.pbio.3000862. eCollection 2020 Nov.
Genes for which homologs can be detected only in a limited group of evolutionarily related species, called "lineage-specific genes," are pervasive: Essentially every lineage has them, and they often comprise a sizable fraction of the group's total genes. Lineage-specific genes are often interpreted as "novel" genes, representing genetic novelty born anew within that lineage. Here, we develop a simple method to test an alternative null hypothesis: that lineage-specific genes do have homologs outside of the lineage that, even while evolving at a constant rate in a novelty-free manner, have merely become undetectable by search algorithms used to infer homology. We show that this null hypothesis is sufficient to explain the lack of detected homologs of a large number of lineage-specific genes in fungi and insects. However, we also find that a minority of lineage-specific genes in both clades are not well explained by this novelty-free model. The method provides a simple way of identifying which lineage-specific genes call for special explanations beyond homology detection failure, highlighting them as interesting candidates for further study.
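The abstract does not spell out the model behind the null hypothesis, so the following is only a minimal sketch of one way it could be made concrete. It assumes that a gene's similarity score to its homologs decays exponentially with evolutionary distance, fits that decay from the species in which homologs are detected, and extrapolates it to a more distant species to ask whether the homolog would be expected to fall below a detectability threshold. The exponential form, the log-linear fit, the fixed score threshold, and all function names and numbers are illustrative assumptions, not details taken from the paper.

```python
import math


def fit_exponential_decay(distances, scores):
    """Fit score = a * exp(-b * distance) by least squares on log(score).

    Hypothetical helper: the log-linear fit is an assumed simplification.
    """
    xs = list(distances)
    ys = [math.log(s) for s in scores]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)  # score at distance 0
    b = -slope                             # decay rate per unit distance
    return a, b


def predicted_score(a, b, distance):
    """Score expected at a given distance under constant-rate decay."""
    return a * math.exp(-b * distance)


def explained_by_detection_failure(distances, scores, outgroup_distance, threshold):
    """True if the extrapolated score falls below the (assumed) threshold,
    i.e. a homolog could exist in the outgroup yet go undetected."""
    a, b = fit_exponential_decay(distances, scores)
    return predicted_score(a, b, outgroup_distance) < threshold


# Made-up example: scores observed in three species within the lineage.
in_lineage_distances = [0.0, 0.5, 1.0]
in_lineage_scores = [400.0 * math.exp(-d) for d in in_lineage_distances]

far_outgroup = explained_by_detection_failure(
    in_lineage_distances, in_lineage_scores, outgroup_distance=3.0, threshold=40.0
)
near_outgroup = explained_by_detection_failure(
    in_lineage_distances, in_lineage_scores, outgroup_distance=1.5, threshold=40.0
)
```

With these made-up numbers, the missing homolog at distance 3.0 is consistent with detection failure, while a missing homolog at distance 1.5 is not and would be a candidate for further study. A realistic treatment would presumably also model the uncertainty in the fitted decay rather than compare a point estimate to a hard cutoff.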