Inferring the evolutionary history of gene clusters from phylogenetic and gene order data.

Affiliation

Département d'informatique et de recherche opérationnelle, Université de Montréal, Montréal, Canada.

Publication information

Mol Biol Evol. 2010 Apr;27(4):761-72. doi: 10.1093/molbev/msp271. Epub 2009 Nov 10.

PMID: 19903657

Abstract

Gene duplication is frequent within gene clusters and plays a fundamental role in evolution by providing a source of new genetic material upon which natural selection can act. Although classical phylogenetic inference methods provide some insight into the evolutionary history of a gene cluster, they are not sufficient on their own to distinguish single-gene from multiple-gene duplication events or to answer other questions regarding the nature and size of evolutionary events. In this paper, we present an algorithm that infers a set of optimal evolutionary histories for a gene cluster in a single species, according to a general cost model involving variable-length duplications (in tandem or inverted), deletions, and inversions. We applied our algorithm to the human olfactory receptor and protocadherin gene clusters, showing that the duplication size distribution differs significantly between the two gene families. The algorithm is available through a web interface at http://www-lbit.iro.umontreal.ca/DILTAG/.
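
To make the four event types named in the abstract concrete, the following is a minimal Python sketch of how they might act on a signed gene order. It is an illustration only and not the published algorithm or the DILTAG implementation: the representation (a list of signed integers, where the sign stands for strand orientation), the function names, and the unit costs in `EVENT_COST` are all assumptions introduced here. The sketch applies events forward in time and sums their costs; it does not search for optimal histories.

```python
from typing import List

# Assumed representation: a gene order is a list of signed integers.
# The magnitude labels a gene, the sign gives its strand orientation.
GeneOrder = List[int]


def _reverse_complement(segment: GeneOrder) -> GeneOrder:
    """Reverse a segment and flip the orientation of every gene in it."""
    return [-gene for gene in reversed(segment)]


def tandem_duplication(order: GeneOrder, start: int, end: int) -> GeneOrder:
    """Copy order[start:end] and insert the copy right after the original."""
    return order[:end] + order[start:end] + order[end:]


def inverted_duplication(order: GeneOrder, start: int, end: int) -> GeneOrder:
    """Copy order[start:end] and insert the copy, reversed and sign-flipped,
    right after the original segment."""
    return order[:end] + _reverse_complement(order[start:end]) + order[end:]


def deletion(order: GeneOrder, start: int, end: int) -> GeneOrder:
    """Remove the segment order[start:end]."""
    return order[:start] + order[end:]


def inversion(order: GeneOrder, start: int, end: int) -> GeneOrder:
    """Reverse order[start:end] in place and flip every orientation."""
    return order[:start] + _reverse_complement(order[start:end]) + order[end:]


# Hypothetical unit costs, chosen for illustration; the abstract does not
# state the costs used by the authors' general cost model.
EVENT_COST = {
    "tandem_duplication": 1.0,
    "inverted_duplication": 1.0,
    "deletion": 1.0,
    "inversion": 1.0,
}


def history_cost(event_names: List[str]) -> float:
    """Total cost of a history given as a sequence of event names."""
    return sum(EVENT_COST[name] for name in event_names)


# A toy history: duplicate the first two genes, invert the copy, delete gene 3.
ancestral = [1, 2, 3]
step1 = tandem_duplication(ancestral, 0, 2)  # [1, 2, 1, 2, 3]
step2 = inversion(step1, 2, 4)               # [1, 2, -2, -1, 3]
step3 = deletion(step2, 4, 5)                # [1, 2, -2, -1]
total = history_cost(["tandem_duplication", "inversion", "deletion"])  # 3.0
```

In this toy case a single inverted duplication of the first two genes followed by a deletion reaches the same final order with two events instead of three, which is the kind of alternative an optimal-history search under a cost model would need to compare.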
