Size distribution of function-based human gene sets and the split-merge model.

Author information

Li Wentian, Fontanelli Oscar, Miramontes Pedro

Affiliations

The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, USA.

Departamento de Matemáticas, Facultad de Ciencias, Universidad Nacional Autónoma de México, Circuito Exterior, Ciudad Universitaria, México 04510 DF, México.

Publication information

R Soc Open Sci. 2016 Aug 3;3(8):160275. doi: 10.1098/rsos.160275. eCollection 2016 Aug.

Abstract

The sizes of paralogues (gene families produced by ancestral duplication) are known to follow a power-law distribution. We examine the size distribution of gene sets or gene families in which genes are grouped by similar function or a shared property. The size distribution of HUGO Gene Nomenclature Committee (HGNC) gene sets deviates from the power law and is fitted much better by a beta rank function. We propose a simple mechanism that breaks a power-law size distribution through a combination of splitting and merging operations. The largest gene sets are split into two to account for subfunctional categories, and a small proportion of the other gene sets are merged into larger sets as new common themes may be recognized. These operations are not uncommon for a curator of gene sets. A simulation shows that iterating these operations changes the size distribution of Ensembl paralogues and could lead to a distribution fitted by a beta rank function. We further illustrate the application of the beta rank function with the example of the distribution of transcription factors and drug target genes among HGNC gene families.
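The split-merge mechanism described in the abstract can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the authors' simulation: the function names (`power_law_sizes`, `split_merge_step`, `simulate`), the uniform random split point, the pairwise merging rule, the `merge_fraction` parameter, and the synthetic power-law starting sizes (standing in for the Ensembl paralogue data) are all hypothetical choices made here for illustration.

```python
import random


def power_law_sizes(n_sets=500, top=1000, exponent=1.0):
    """Synthetic ranked sizes following size(r) ~ top / r**exponent (assumed input)."""
    return [max(1, round(top / r ** exponent)) for r in range(1, n_sets + 1)]


def split_merge_step(sizes, merge_fraction=0.05, rng=random):
    """One iteration: split the largest set in two, then merge a small
    fraction of the remaining sets pairwise. The total gene count is conserved."""
    sizes = sorted(sizes, reverse=True)
    largest, rest = sizes[0], sizes[1:]

    # Split step (assumption: the cut point is uniform at random).
    if largest >= 2:
        left = rng.randint(1, largest - 1)
        pieces = [left, largest - left]
    else:
        pieces = [largest]

    # Merge step (assumption: an even number of randomly chosen sets,
    # merged two at a time).
    n_pick = int(len(rest) * merge_fraction)
    n_pick -= n_pick % 2
    idx = list(range(len(rest)))
    rng.shuffle(idx)
    chosen, kept = idx[:n_pick], idx[n_pick:]
    merged = [rest[chosen[i]] + rest[chosen[i + 1]] for i in range(0, n_pick, 2)]
    untouched = [rest[i] for i in kept]

    return sorted(pieces + merged + untouched, reverse=True)


def simulate(sizes, n_iter=50, merge_fraction=0.05, seed=0):
    """Iterate the split-merge step with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    for _ in range(n_iter):
        sizes = split_merge_step(sizes, merge_fraction, rng)
    return sizes


if __name__ == "__main__":
    initial = power_law_sizes()
    final = simulate(initial)
    print("largest sizes before:", initial[:5])
    print("largest sizes after: ", final[:5])
```

Plotting the ranked sizes before and after on log-log axes is one way to see whether the straight power-law line bends. A beta rank function is commonly written as size(r) = C (N + 1 - r)^b / r^a for rank r among N sets; fitting that form to the output is left out of this sketch.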

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9cbc/5108952/91a3a3563a78/rsos160275-g1.jpg
