Computational analysis of composite regulatory elements.

Author information

Qiu Ping, Ding Wei, Jiang Ying, Greene Jonathan R, Wang Luquan

Affiliations

Bioinformatics Group and Human Genomic Research Department at Schering-Plough Research Institute, 2015 Galloping Hill Road, Kenilworth, New Jersey 07033, USA.

Publication information

Mamm Genome. 2002 Jun;13(6):327-32. doi: 10.1007/s00335-001-2141-8.

DOI:10.1007/s00335-001-2141-8

PMID:12115037

Abstract

Combinatorial regulation is a powerful mechanism for generating specificity in gene expression, and it is thought to play a pivotal role in the formation of the complex gene regulatory networks found in higher eukaryotes. The term "Composite Element" (CE) refers to a minimal functional unit in which protein-DNA and protein-protein interactions contribute to a highly specific pattern of gene transcriptional regulation. Identifying composite elements will improve our understanding of gene regulatory networks. Experimentally identified CEs are limited in number, and the currently available CE database, COMPEL, is based on such published information. Here we describe a computational method that predicts composite regulatory elements in genomic sequences from a statistical analysis of over-represented adjacent transcription factor binding sites. The algorithm proved efficient at extracting composite elements that had been experimentally confirmed and documented in the COMPEL database. The method also predicts putative new composite elements, and by searching the published literature we were able to confirm some predictions that are not included in the COMPEL database.
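The abstract does not give the statistical details, so the following is only a minimal sketch of one way to score over-represented adjacent binding sites, not the authors' algorithm. It assumes binding-site hits have already been predicted for each sequence, counts the sequences in which two different factors have sites within `max_gap` base pairs, and compares that count with a binomial null built from the per-factor frequencies. The function names, the `max_gap` and `alpha` parameters, and the choice of a binomial test are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjacent_pairs(hits, max_gap=50):
    """Unordered pairs of distinct factors with sites within max_gap bp.

    hits: list of (factor_name, position) tuples for one sequence
    (hypothetical input format, assumed for this sketch).
    """
    pairs = set()
    for (f1, p1), (f2, p2) in combinations(hits, 2):
        if f1 != f2 and abs(p1 - p2) <= max_gap:
            pairs.add(tuple(sorted((f1, f2))))
    return pairs


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def overrepresented_pairs(sequences, max_gap=50, alpha=0.05):
    """Return (factor_a, factor_b, count, p_value) for significant pairs.

    Null model (an assumption): the two factors occur independently, so
    the chance that a sequence carries both is the product of their
    per-sequence frequencies. This is an upper bound on the chance of
    adjacency under independence, which makes the test conservative.
    No multiple-testing correction is applied in this sketch.
    """
    n = len(sequences)
    single = Counter()   # sequences containing each factor
    paired = Counter()   # sequences containing each adjacent pair
    for hits in sequences:
        for factor in {f for f, _ in hits}:
            single[factor] += 1
        for pair in adjacent_pairs(hits, max_gap):
            paired[pair] += 1

    results = []
    for (a, b), k in paired.items():
        p_null = (single[a] / n) * (single[b] / n)
        p_value = binomial_tail(k, n, p_null)
        if p_value <= alpha:
            results.append((a, b, k, p_value))
    return sorted(results, key=lambda r: r[3])
```

A real analysis would also need a background sequence set, a correction for testing many factor pairs, and a comparison against the entries documented in COMPEL. None of these are specified in the abstract, so they are left out here.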
