Meseguer Alberto, Årman Filip, Fornes Oriol, Molina-Fernández Ruben, Bonet Jaume, Fernandez-Fuentes Narcis, Oliva Baldo
Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, University Pompeu Fabra, Barcelona, Catalonia 08005, Spain.
Centre for Molecular Medicine and Therapeutics, BC Children's Hospital Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, BC V5Z 4H4, Canada.
NAR Genom Bioinform. 2020 Jul 1;2(3):lqaa046. doi: 10.1093/nargab/lqaa046. eCollection 2020 Sep.
Cys2-His2 zinc finger (C2H2-ZF) proteins are the largest family of transcription factors in humans and higher metazoans. To date, the DNA-binding preferences of many members of this family remain unknown. We have developed a computational method to predict these preferences. The method computes theoretical position weight matrices (PWMs) for proteins composed of C2H2-ZF domains and requires only a structure as input. For about 70% of the variants of Zif268, a classical member of this family, we predicted more than two-thirds of the binding site of a single zinc-finger domain. For examples of proteins in JASPAR, a standard database of PWMs, that are composed of three C2H2-ZF domains, we successfully matched between 60 and 90% of the binding-site motif. These tests serve as a proof of the capacity to scan a DNA fragment and find the potential binding sites of transcription factors formed by C2H2-ZF domains. As an example, we have applied the approach to predict the DNA-binding preferences of the human chromatin-binding factor CTCF. We offer a server to model the structure of a zinc-finger protein and predict its PWM.
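To illustrate what scanning a DNA fragment with a PWM involves, the following is a minimal sketch of a generic log-odds scan over both strands. It is not the authors' implementation: the function names, the pseudocount, the uniform background, the score threshold, and the three-column toy matrix `TOY_PWM` are all assumptions made for illustration, and the matrix is not taken from the paper or from JASPAR.

```python
import math

BASES = "ACGT"

# Hypothetical toy PWM (per-position base probabilities) preferring "GCG".
TOY_PWM = [
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
    {"A": 0.0, "C": 1.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
]


def log_odds(pwm, background=0.25, pseudocount=0.01):
    """Convert base probabilities into log2-odds against a uniform background."""
    scores = []
    for column in pwm:
        total = sum(column[b] + pseudocount for b in BASES)
        scores.append(
            {b: math.log2(((column[b] + pseudocount) / total) / background)
             for b in BASES}
        )
    return scores


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(sequence, pwm, threshold=0.0):
    """Slide the PWM along both strands of an uppercase ACGT sequence.

    Returns (start, strand, score) tuples for windows scoring at or above
    the threshold, sorted by decreasing score. `start` is always given in
    forward-strand coordinates (0-based).
    """
    scores = log_odds(pwm)
    width = len(scores)
    hits = []
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        for i in range(len(seq) - width + 1):
            window = seq[i:i + width]
            score = sum(scores[j][base] for j, base in enumerate(window))
            if score >= threshold:
                # Map reverse-strand window positions back to the forward strand.
                start = i if strand == "+" else len(seq) - width - i
                hits.append((start, strand, score))
    return sorted(hits, key=lambda hit: -hit[2])


if __name__ == "__main__":
    for start, strand, score in scan("ATGCGTA", TOY_PWM):
        print(start, strand, round(score, 2))
```

In practice the threshold would be calibrated against the score distribution of the matrix rather than fixed at zero, and a predicted PWM would be compared with a reference motif by a motif-similarity measure rather than by exact window matches.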