Pritykin Yuri, Ghersi Dario, Singh Mona
Department of Computer Science, Princeton University, Princeton, New Jersey, United States of America; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America.
Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America; School of Interdisciplinary Informatics, University of Nebraska at Omaha, Omaha, Nebraska, United States of America.
PLoS Comput Biol. 2015 Oct 5;11(10):e1004467. doi: 10.1371/journal.pcbi.1004467. eCollection 2015 Oct.
Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms (H. sapiens, D. melanogaster, and S. cerevisiae) and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality.
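The idea of extracting genes annotated with at least two distinct biological processes can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not specify. The toy ontology, the gene names, and the distinctness rule (two terms count as distinct only if they share no term once each is expanded to include all of its ancestors) are assumptions made purely for illustration, as are the function names `ancestors`, `are_distinct`, and `multifunctional_genes`.

```python
from itertools import combinations

# Hypothetical toy ontology (an assumption): term -> set of direct parent terms.
PARENTS = {
    "metabolism": set(),
    "dna_metabolism": {"metabolism"},
    "dna_repair": {"dna_metabolism"},
    "cell_cycle": set(),
    "mitosis": {"cell_cycle"},
}

# Hypothetical gene annotations (an assumption): gene -> set of annotated terms.
ANNOTATIONS = {
    "geneA": {"dna_repair", "mitosis"},
    "geneB": {"dna_repair", "dna_metabolism"},
    "geneC": {"cell_cycle"},
    "geneD": {"metabolism", "cell_cycle", "mitosis"},
}


def ancestors(term, parents=PARENTS):
    """Return every strict ancestor of `term` in the ontology graph."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def are_distinct(term_a, term_b, parents=PARENTS):
    """Assumed rule: terms are distinct if their ancestor closures are disjoint.

    The closure of a term is the term itself plus all of its ancestors, so
    identical terms and ancestor/descendant pairs are never distinct.
    """
    closure_a = ancestors(term_a, parents) | {term_a}
    closure_b = ancestors(term_b, parents) | {term_b}
    return closure_a.isdisjoint(closure_b)


def multifunctional_genes(annotations, parents=PARENTS):
    """Return sorted gene names having at least one pair of distinct terms."""
    selected = []
    for gene, terms in annotations.items():
        pairs = combinations(sorted(terms), 2)
        if any(are_distinct(a, b, parents) for a, b in pairs):
            selected.append(gene)
    return sorted(selected)


if __name__ == "__main__":
    print(multifunctional_genes(ANNOTATIONS))
```

In this toy setting, a gene annotated only with a term and one of its ancestors (such as `geneB`) is not counted as multifunctional, since the two annotations describe the same process at different levels of detail. A real analysis would need a more careful notion of distinctness over the full annotation hierarchy.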