Singh Param Priya, Arora Jatin, Isambert Hervé
CNRS UMR168, UPMC, Institut Curie, Research Center, Paris, France.
PLoS Comput Biol. 2015 Jul 16;11(7):e1004394. doi: 10.1371/journal.pcbi.1004394. eCollection 2015 Jul.
Whole genome duplications (WGD) have now been firmly established in all major eukaryotic kingdoms. In particular, all vertebrates descend from two rounds of WGD that occurred in their jawless ancestor some 500 MY ago. Paralogs retained from WGD, also termed 'ohnologs' after Susumu Ohno, have been shown to be typically associated with development, signaling and gene regulation. Ohnologs, which amount to about 20 to 35% of genes in the human genome, have also been shown to be prone to dominant deleterious mutations and frequently implicated in cancer and genetic diseases. Hence, identifying ohnologs is central to better understanding the evolution of vertebrates and their susceptibility to genetic diseases. Early computational analyses to identify vertebrate ohnologs relied on content-based synteny comparisons between the human genome and a single invertebrate outgroup genome or within the human genome itself. These approaches are thus limited by lineage-specific rearrangements in individual genomes. We report, in this study, the identification of vertebrate ohnologs based on the quantitative assessment and integration of synteny conservation between six amniote vertebrates and six invertebrate outgroups. Such a synteny comparison across multiple genomes is shown to enhance the statistical power of ohnolog identification in vertebrates compared to earlier approaches, by overcoming lineage-specific genome rearrangements. Ohnolog gene families can be browsed and downloaded for three statistical confidence levels or recompiled for specific, user-defined significance criteria at http://ohnologs.curie.fr/. In light of the importance of WGD for the genetic makeup of vertebrates, our analysis provides a useful resource for researchers interested in gaining further insights into vertebrate evolution and genetic diseases.
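The idea of integrating synteny evidence across several outgroups and then applying a user-defined significance criterion can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the scoring rule, so the geometric-mean combination, the function names `combined_score` and `filter_pairs`, the gene labels, and the probability values are all hypothetical and should not be read as the authors' method or data.

```python
from math import exp, log


def combined_score(chance_probs):
    """Combine per-outgroup chance probabilities with a geometric mean.

    Assumption: each value is the probability that the synteny support
    seen against one outgroup genome arose by chance. The geometric mean
    is an illustrative choice, not a rule taken from the abstract.
    """
    if not chance_probs:
        raise ValueError("need at least one outgroup comparison")
    if any(not (0.0 < p <= 1.0) for p in chance_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    return exp(sum(log(p) for p in chance_probs) / len(chance_probs))


def filter_pairs(candidates, threshold):
    """Keep candidate pairs whose combined score is below the threshold."""
    selected = {}
    for pair, probs in candidates.items():
        score = combined_score(probs)
        if score < threshold:
            selected[pair] = score
    return selected


# Hypothetical candidate pairs, each scored against three outgroups.
candidates = {
    ("geneA", "geneB"): [0.01, 0.04, 0.02],  # consistent support
    ("geneC", "geneD"): [0.5, 0.8, 0.3],     # weak support
}

selected = filter_pairs(candidates, threshold=0.05)
for pair, score in selected.items():
    print(pair, round(score, 4))
```

Under this toy rule, a pair supported consistently by several outgroups passes the threshold even if a rearrangement in any single genome weakens one comparison, which is the intuition behind comparing multiple genomes rather than one.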