Wong Ka-Chun, Li Yue, Peng Chengbin, Moses Alan M, Zhang Zhaolei
Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong
Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
CSAIL, Massachusetts Institute of Technology, Cambridge, MA 02139-4307, USA
Nucleic Acids Res. 2015 Dec 2;43(21):10180-9. doi: 10.1093/nar/gkv1134. Epub 2015 Nov 2.
The protein-DNA interactions between transcription factors and transcription factor binding sites are essential to gene regulation. Understanding the binding mechanism across different transcription factor DNA-binding families, and thereby deciphering the binding codes, is a long-standing challenge. Past computational learning studies have usually focused on learning and predicting the DNA-binding residues on the protein side. Taking both sides (protein and DNA) into account, we propose and describe a computational study for learning the specificity-determining residue-nucleotide interactions of different known DNA-binding domain families. The proposed learning models are compared comprehensively with state-of-the-art models, demonstrating their competitive learning performance. In addition, we describe two applications that demonstrate how the learnt models can provide meaningful insights into protein-DNA interactions across different DNA-binding families.