Bioinformatics and Molecular Evolution Unit, Department of Biology, National University of Ireland Maynooth, Maynooth, Co. Kildare, Ireland.
Mol Biol Evol. 2014 Mar;31(3):501-16. doi: 10.1093/molbev/mst228. Epub 2013 Nov 22.
Defining homologous genes is important in many evolutionary studies but raises obvious issues. Some of these issues are conceptual and stem from our assumptions about how a gene evolves; others are practical and depend on the algorithmic decisions implemented in existing software. Therefore, to make progress in the study of homology, both ontological and epistemological questions must be considered. In particular, the definition of homologous genes cannot be addressed solely under the classic assumptions of strong tree thinking, according to which genes evolve in a strictly tree-like fashion of vertical descent and divergence and the problems of homology detection are primarily methodological. Gene homology could also be considered from a different perspective, in which genes evolve as "public goods" subject to various introgressive processes. In this latter case, defining homologous genes becomes a matter of designing models suited to the actual complexity of the data and to how such complexity arises, rather than trying to fit genetic data to some a priori tree-like evolutionary model, a practice that inevitably results in the loss of much information. Here we show how important aspects of the problems raised by homology detection methods can be overcome when the more fundamental roots of these problems are addressed, by using public goods thinking to analyze the evolutionary processes through which genes have frequently originated. This kind of thinking acknowledges distinct types of homologs, characterized by distinct patterns, in phylogenetic and nonphylogenetic unrooted or multirooted networks. In addition, we define "family resemblances" to include genes that are related through intermediate relatives, thereby placing notions of homology in the broader context of evolutionary relationships. We conclude by presenting some payoffs of adopting such a pluralistic account of homology and family relationship, which expands the scope of evolutionary analyses beyond the traditional, yet relatively narrow, focus allowed by a strong tree-thinking view of gene evolution.