Gene-Family Extension Measures and Correlations.

Author information

Carmi Gon, Bolshoy Alexander

Affiliation

Department of Evolutionary and Environmental Biology, University of Haifa, Haifa 3498838, Israel.

Publication information

Life (Basel). 2016 Aug 3;6(3):30. doi: 10.3390/life6030030.

Abstract

The existence of multiple copies of genes is a well-known phenomenon. A gene family is a set of sufficiently similar genes formed by gene duplication. Earlier studies, conducted on a limited number of completely sequenced and annotated genomes, found that gene family size and genome size are positively correlated. They also found that several atypical microbes deviate from this general trend. In this study, we reexamined these associations on a larger dataset of 1484 prokaryotic genomes, using several ranking approaches. We applied the ranking methods so that genomes with fewer gene copies receive a lower rank. Until now, only simple ranking methods had been used; we also applied the Kemeny optimal aggregation approach. Regression and correlation analyses were used to quantify and characterize the relationships between paralog indices and genome size. In addition, boxplot analysis was employed for outlier detection. We found that, in general, all paralog indices correlate positively with genome size. As expected, different groups of atypical prokaryotic genomes were found for different types of paralog quantities. Mycoplasmataceae and Halobacteria appear to be among the most interesting candidates for further research on evolution through gene duplication.
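
As a rough illustration of the Kemeny optimal aggregation idea named in the abstract, the sketch below combines several rankings of the same genomes into one consensus ordering by minimising the total Kendall tau distance (the number of pairwise disagreements) to the input rankings. This is a brute-force toy version under stated assumptions and is not the authors' implementation: the genome labels `g1` to `g4`, the three input rankings, and the function names are hypothetical, and exhaustive search over all permutations is only feasible for a handful of items, far fewer than the 1484 genomes in the study.

```python
from itertools import combinations, permutations


def kendall_tau_distance(order_a, order_b):
    """Count item pairs that the two orderings place in opposite order."""
    pos_a = {item: i for i, item in enumerate(order_a)}
    pos_b = {item: i for i, item in enumerate(order_b)}
    return sum(
        1
        for x, y in combinations(order_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def kemeny_consensus(rankings):
    """Return the ordering with the smallest summed Kendall tau distance.

    Brute force over all n! candidate orderings, so only usable for tiny n.
    """
    items = sorted(rankings[0])
    best_order, best_cost = None, None
    for candidate in permutations(items):
        cost = sum(kendall_tau_distance(candidate, r) for r in rankings)
        if best_cost is None or cost < best_cost:
            best_order, best_cost = list(candidate), cost
    return best_order, best_cost


# Hypothetical toy input: three paralog measures, each ranking four genomes
# from fewest gene copies (lowest rank) to most.
rankings = [
    ["g1", "g2", "g3", "g4"],
    ["g1", "g3", "g2", "g4"],
    ["g2", "g1", "g3", "g4"],
]
consensus, cost = kemeny_consensus(rankings)
print(consensus, cost)
```

For realistic dataset sizes an exact search like this is impractical, since finding a Kemeny optimal ranking is NP-hard in general; the abstract does not say which solver or approximation was used.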

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/16f5/5041006/5d61a7be931a/life-06-00030-g001.jpg
