Homoeolog Inference Methods Requiring Bidirectional Best Hits or Synteny Miss Many Pairs.

Affiliations

SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Center for Integrative Genomics, University of Lausanne, Switzerland.

Publication information

Genome Biol Evol. 2021 Jun 8;13(6). doi: 10.1093/gbe/evab077.

Abstract

Homoeologs are pairs of genes or chromosomes in the same species that originated by speciation and were brought back together in the same genome by allopolyploidization. Bioinformatic methods for accurate homoeology inference are crucial for studying the evolutionary consequences of polyploidization, and homoeology is typically inferred on the basis of bidirectional best hit (BBH) and/or positional conservation (synteny). However, these methods neglect the fact that genes can duplicate and move, both prior to and after the allopolyploidization event. These duplications and movements can result in many-to-many and/or nonsyntenic homoeologs, which thus remain undetected and unstudied. Here, using the allotetraploid upland cotton (Gossypium hirsutum) as a case study, we show that conventional approaches indeed miss a substantial proportion of homoeologs. Additionally, we found that many of the missed pairs of homoeologs are broadly and highly expressed. A gene ontology analysis revealed that a high proportion of the nonsyntenic and non-BBH homoeologs are involved in protein translation and are likely to contribute to the functional repertoire of cotton. Thus, from an evolutionary and functional genomics standpoint, choosing a homoeolog inference method that does not rely solely on 1:1 relationship cardinality or synteny is crucial for not missing these potentially important homoeolog pairs.
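To illustrate why a strict BBH criterion can drop homoeolog pairs, here is a minimal Python sketch on toy data. It is not the pipeline used in the study: the gene names (`A1`, `D1`, `D1_dup`, ...), the similarity scores, and the helper functions `best_hits` and `bidirectional_best_hits` are all hypothetical and chosen only to show the effect of a single duplication in one subgenome.

```python
def best_hits(scores):
    """Map each query gene to its single highest-scoring subject gene."""
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(scores):
    """Keep only pairs in which each gene is the other's best hit."""
    forward = best_hits(scores)
    # Swap query and subject to obtain best hits in the reverse direction.
    reverse = best_hits({(s, q): v for (q, s), v in scores.items()})
    return {(q, s) for q, s in forward.items() if reverse.get(s) == q}


# Invented similarity scores between genes of two subgenomes (A and D).
# Every listed pair is treated as a true homoeolog pair in this toy setup.
scores = {
    ("A1", "D1"): 98.0,
    ("A1", "D1_dup"): 95.0,  # assumed duplicate of D1 in the D subgenome
    ("A2", "D2"): 97.0,
}

all_pairs = set(scores)
bbh_pairs = bidirectional_best_hits(scores)
missed = all_pairs - bbh_pairs

print("BBH pairs:", sorted(bbh_pairs))
print("Missed by BBH:", sorted(missed))
```

In this toy case the one-to-many relationship between `A1` and the two D-subgenome copies means that only the top-scoring pair survives the BBH filter, so the pair involving `D1_dup` is never reported.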

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/33f2/8214411/9cd831ea3997/evab077f1.jpg
