BMC Bioinformatics. 2014;15 Suppl 12(Suppl 12):S4. doi: 10.1186/1471-2105-15-S12-S4. Epub 2014 Nov 6.
Protein-DNA interactions are essential for many biological processes. However, the structural mechanisms underlying these interactions are not fully understood. DNA binding proteins can be classified into double-stranded DNA binding proteins (DSBs) and single-stranded DNA binding proteins (SSBs), which take part in different biological functions. DSBs usually act as transcription factors that regulate gene expression, while SSBs usually play roles in DNA replication, recombination, and repair. Understanding the binding specificity of a DNA binding protein helps in studying its function.
In this paper, we investigated the differences between DSBs and SSBs in their surface tunnels and in OB-fold domain information. We detected the largest clefts on the protein surfaces and derived from them several features for distinguishing the potential interfaces of SSBs from those of DSBs. We also compared each protein's structure with each of six OB-fold protein templates and used the maximal alignment score (TM-score) as the OB-fold feature of the protein. Based on these features, we constructed a support vector machine (SVM) classification model to automatically distinguish the two kinds of proteins, with prediction accuracies of 87%, 83%, and 83% for the HOLO-set, APO-set, and Mixed-set, respectively.
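The feature construction described above can be illustrated with a minimal sketch. It assumes the TM-scores against the six templates have already been computed by an external structural alignment tool. The template names, the numeric values, and the choice of tunnel length and tunnel curvature as the cleft-derived features are illustrative assumptions, not values or definitions taken from the paper.

```python
from typing import Dict, List


def ob_fold_feature(tm_scores: Dict[str, float]) -> float:
    """Return the maximal TM-score over all OB-fold templates."""
    if not tm_scores:
        raise ValueError("need at least one template score")
    return max(tm_scores.values())


def feature_vector(tunnel_length: float,
                   tunnel_curvature: float,
                   tm_scores: Dict[str, float]) -> List[float]:
    """Combine tunnel features with the OB-fold feature (hypothetical layout)."""
    return [tunnel_length, tunnel_curvature, ob_fold_feature(tm_scores)]


# Hypothetical TM-scores of one protein against six placeholder templates.
scores = {"T1": 0.31, "T2": 0.52, "T3": 0.27,
          "T4": 0.44, "T5": 0.38, "T6": 0.29}

# Illustrative tunnel length and curvature; units are left unspecified here.
vec = feature_vector(18.5, 0.12, scores)
```

A vector of this kind, one per protein, would then be the input to a classifier such as the SVM mentioned above.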
We found that SSBs and DSBs have different ranges of tunnel lengths and tunnel curvatures; moreover, the alignment results with OB-fold templates were also found to be a discriminative feature of SSBs and DSBs. Experimental results from 10-fold cross-validation indicate that the new feature set is effective for describing DNA binding proteins. Evaluation on both bound (DNA-bound) and non-bound (DNA-free) proteins showed satisfactory performance of our method.
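As a rough illustration of the 10-fold cross-validation protocol, the sketch below splits a dataset into ten folds and averages held-out accuracy. To stay dependency-free it uses a nearest-centroid classifier as a stand-in, which is not the SVM used in the paper. The data are synthetic and the function names are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_nearest_centroid(train_x: Sequence[Sequence[float]],
                         train_y: Sequence[str]) -> Callable:
    """Stand-in classifier (NOT an SVM): predict the class of the closest centroid."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x: Sequence[float]) -> str:
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2
                                     for a, b in zip(x, centroids[y])))
    return predict


def cross_val_accuracy(xs, ys, fit, k: int = 10, seed: int = 0) -> float:
    """Train on k-1 folds, test on the held-out fold, pool the accuracy."""
    correct = 0
    for fold in k_fold_indices(len(xs), k, seed):
        held = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        predict = fit(train_x, train_y)
        correct += sum(predict(xs[i]) == ys[i] for i in fold)
    return correct / len(xs)


# Synthetic, well-separated toy data: [tunnel_length, max_tm_score].
xs = [[10 + 0.1 * i, 0.8] for i in range(20)] + \
     [[30 + 0.1 * i, 0.3] for i in range(20)]
ys = ["SSB"] * 20 + ["DSB"] * 20

folds = k_fold_indices(len(xs))
acc = cross_val_accuracy(xs, ys, fit_nearest_centroid)
```

Swapping `fit_nearest_centroid` for a function that trains an SVM would give a protocol closer to the one reported, provided it returns a `predict` callable of the same shape.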