Mandel-Gutfreund Y, Baron A, Margalit H
Department of Molecular Genetics and Biotechnology, Hebrew University-Hadassah Medical School, POB 12272, Jerusalem 91120 Israel.
Pac Symp Biocomput. 2001:139-50. doi: 10.1142/9789814447362_0015.
The rapid accumulation of new genes in the databases has highlighted the challenge of identifying DNA regulatory sequences from sequence information alone. While most predictive algorithms rely on multiple alignments of known binding sites, here we examine the usefulness of a novel approach based on structural information from the protein-DNA complex. It has been shown that specific recognition between a protein and its DNA target is achieved by stereochemical complementarity between the protein's amino acids and the DNA bases. The proposed computational scheme uses crystallographic information to define the set of amino acid-base contacts between the proteins of a given DNA-binding protein family and their DNA targets. The compatibility of a given protein with a putative regulatory DNA sequence is then evaluated using knowledge-based parameters for amino acid-base interactions. With this procedure, gene upstream regions can be screened for potential binding sites of regulatory proteins. Predictions are demonstrated for the E. coli cyclic AMP receptor protein (CRP), which recognizes DNA via the helix-turn-helix motif, and for various Zif268-like proteins, which belong to the Cys2His2 zinc finger family. The advantages and limitations of this approach are discussed.
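The screening procedure described in the abstract can be illustrated with a minimal sketch. It assumes that the contact set reduces to a list of (amino acid, base position) pairs and that a site's score is the sum of the pairwise parameters. The names `PAIR_SCORES`, `CONTACTS`, `score_site`, and `scan_upstream` are hypothetical, and the numeric values are invented for illustration; they are not the knowledge-based parameters or the contact maps used in the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical amino acid-base scores (one-letter amino acid code, DNA base).
# Invented values for illustration only, NOT the published parameters.
PAIR_SCORES: Dict[Tuple[str, str], float] = {
    ("R", "G"): 2.0,   # assumed favorable Arg-guanine contact
    ("R", "A"): -0.5,  # assumed unfavorable Arg-adenine contact
    ("E", "C"): 1.0,   # assumed favorable Glu-cytosine contact
}

# Hypothetical contact map: (amino acid, offset of the contacted base in the site).
# In the scheme described above, this would be derived from crystal structures.
CONTACTS: List[Tuple[str, int]] = [("R", 0), ("E", 1), ("R", 2)]


def score_site(site: str,
               contacts: List[Tuple[str, int]],
               scores: Dict[Tuple[str, str], float],
               default: float = 0.0) -> float:
    """Sum the pair scores over all contacts for one candidate site."""
    return sum(scores.get((aa, site[pos]), default) for aa, pos in contacts)


def scan_upstream(seq: str,
                  contacts: List[Tuple[str, int]],
                  scores: Dict[Tuple[str, str], float],
                  site_len: int) -> List[Tuple[int, str, float]]:
    """Slide a window over the sequence and rank candidate sites by score."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - site_len + 1):
        window = seq[start:start + site_len]
        hits.append((start, window, score_site(window, contacts, scores)))
    # Highest-scoring windows first.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    for start, window, score in scan_upstream("ATGCGTA", CONTACTS, PAIR_SCORES, 3):
        print(start, window, score)
```

A real application would presumably also scan the complementary strand and set a score threshold for reporting sites; both are omitted here to keep the sketch short.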