Lavallée-Adam Mathieu, Cloutier Philippe, Coulombe Benoit, Blanchette Mathieu
McGill Centre for Bioinformatics and School of Computer Science, McGill University, Montréal, Québec H3A 0E9, Canada.
Ottawa Institute of Systems Biology and Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, Ontario K1H 8M5, Canada.
Nucleic Acids Res. 2017 Oct 13;45(18):10415-10427. doi: 10.1093/nar/gkx751.
Biological networks are rich representations of the relationships between entities such as genes or proteins, and they have become increasingly complete thanks to high-throughput experimental approaches to network mapping. Here, we propose a method that uses such networks to guide the search for functional sequence motifs. Specifically, we introduce Local Enrichment of Sequence Motifs in biological Networks (LESMoN), an enumerative motif discovery algorithm that identifies 5' untranslated region (UTR) sequence motifs whose associated proteins form unexpectedly dense clusters in a given biological network. When applied to the human protein-protein interaction network from BioGRID, LESMoN identifies several highly significant 5' UTR sequence motifs, including both previously known and uncharacterized motifs. The vast majority of these motifs are evolutionarily conserved, and the genes containing them are significantly enriched for various Gene Ontology terms, suggesting new associations between 5' UTR motifs and a number of biological processes. We validate in vivo the role of three motifs identified by LESMoN in regulating protein expression.
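The idea in the abstract, enumerating candidate motifs and asking whether the proteins whose 5' UTRs carry a motif sit unusually close together in the network, can be illustrated with a minimal Python sketch. This is not the LESMoN implementation: the abstract does not state which clustering statistic or significance model LESMoN uses, so the sketch assumes exact k-mer matching, edge density as the clustering measure, and a Monte Carlo comparison against random protein sets of the same size. All function names and the toy network and sequences are hypothetical, and no multiple-testing correction is applied.

```python
import random
from itertools import combinations


def build_network(edges):
    """Build an undirected adjacency map {protein: set(neighbours)}."""
    network = {}
    for a, b in edges:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def proteins_with_motif(utrs, motif):
    """Proteins whose 5' UTR contains the motif as an exact substring."""
    return {protein for protein, seq in utrs.items() if motif in seq}


def edge_density(network, proteins):
    """Fraction of protein pairs that are directly linked (assumed statistic)."""
    pairs = list(combinations(sorted(proteins), 2))
    if not pairs:
        return 0.0
    linked = sum(1 for a, b in pairs if b in network.get(a, set()))
    return linked / len(pairs)


def empirical_p_value(network, proteins, n_samples=1000, seed=0):
    """Share of random same-size protein sets at least as dense as observed."""
    rng = random.Random(seed)
    members = sorted(p for p in proteins if p in network)
    observed = edge_density(network, members)
    universe = sorted(network)
    hits = sum(
        1
        for _ in range(n_samples)
        if edge_density(network, rng.sample(universe, len(members))) >= observed
    )
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_samples + 1)


def scan_motifs(network, utrs, k=5, min_proteins=4):
    """Enumerate every k-mer seen in the UTRs and score the frequent ones."""
    kmers = {seq[i:i + k] for seq in utrs.values() for i in range(len(seq) - k + 1)}
    scores = {}
    for motif in sorted(kmers):
        members = {p for p in proteins_with_motif(utrs, motif) if p in network}
        if len(members) >= min_proteins:
            scores[motif] = empirical_p_value(network, members)
    return scores


# Toy data (invented for illustration): A-D form a clique, E-L form a chain.
toy_edges = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"), ("F", "G"), ("G", "H"),
    ("H", "I"), ("I", "J"), ("J", "K"), ("K", "L"),
]
toy_utrs = {
    "A": "TTGGCGGAA", "B": "ACGGCGGTT", "C": "GGCGGATAT", "D": "CATGGCGGC",
    "E": "ATATATATA", "F": "TTTTAAAAC", "G": "CACACACAC", "H": "AGAGAGAGA",
    "I": "TCTCTCTCT", "J": "ATTTAATTA", "K": "CCATTACCA", "L": "TGTATGTAT",
}
toy_network = build_network(toy_edges)
toy_scores = scan_motifs(toy_network, toy_utrs)
```

In this toy setting the motif `GGCGG` occurs only in the four mutually interacting proteins, so its protein set is far denser than random sets of four proteins. On a real interactome, exhaustive enumeration and sampling at this scale would need a more efficient statistic and a correction for the number of motifs tested.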