Computer Science and Engineering and Information Technology Department, Shiraz University, Shiraz, Iran.
Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan; Institute for Integrated and Intelligent Systems, Griffith University, Nathan, Brisbane, QLD 4111, Australia.
Gene. 2023 Feb 15;853:147045. doi: 10.1016/j.gene.2022.147045. Epub 2022 Nov 26.
DNA-binding proteins play a vital role in biological processes including DNA replication, DNA packaging, and DNA repair. They can be classified as single-stranded DNA-binding proteins (SSBs) or double-stranded DNA-binding proteins (DSBs). Determining whether a protein is a DSB or an SSB helps determine its function, and many studies in recent years have therefore aimed to identify DSBs and SSBs accurately. Despite these efforts, DSB and SSB prediction performance remains limited. In this study, we propose a new method called CNN-Pred to accurately predict DSBs and SSBs. To build CNN-Pred, we first extract evolutionary-based features in the form of mono-gram and bi-gram profiles from the position-specific scoring matrix (PSSM). We then use a 1D convolutional neural network (CNN) as the classifier on the extracted features. Our results demonstrate that CNN-Pred improves DSB and SSB prediction accuracy by more than 4% on the independent test set compared with previous studies in the literature. CNN-Pred is publicly available as a standalone tool, together with all of its source code, at: https://github.com/MLBC-lab/CNN-Pred.
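The abstract does not give the feature formulas, so the following is only a minimal sketch under stated assumptions. It assumes the definitions commonly used for PSSM-derived profiles: the mono-gram is the column-wise mean of the PSSM, and the bi-gram entry for columns m and n is the sum, over consecutive residue positions i, of PSSM[i][m] * PSSM[i+1][n]. The function names (`monogram`, `bigram`, `conv1d_valid`) and the toy matrix are hypothetical and are not taken from the CNN-Pred source code. A real PSSM has 20 columns, giving 20 mono-gram and 400 bi-gram features; any normalization of the PSSM is omitted here. The last function shows a single 1D convolution filter with ReLU, only to illustrate the kind of operation a 1D CNN applies to such a feature vector; it is not the CNN-Pred architecture.

```python
from typing import List

Matrix = List[List[float]]


def monogram(pssm: Matrix) -> List[float]:
    """Column-wise mean of an L x C PSSM (assumed definition): one value per column."""
    length = len(pssm)
    n_cols = len(pssm[0])
    return [sum(row[j] for row in pssm) / length for j in range(n_cols)]


def bigram(pssm: Matrix) -> List[float]:
    """Flattened C x C matrix B (assumed definition) with
    B[m][n] = sum over i of pssm[i][m] * pssm[i + 1][n]."""
    n_cols = len(pssm[0])
    features = [0.0] * (n_cols * n_cols)
    # Pair every row with the row that follows it in the sequence.
    for current, following in zip(pssm, pssm[1:]):
        for m in range(n_cols):
            for n in range(n_cols):
                features[m * n_cols + n] += current[m] * following[n]
    return features


def conv1d_valid(signal: List[float], kernel: List[float]) -> List[float]:
    """One 1D filter slid over the signal without padding, followed by ReLU."""
    k = len(kernel)
    out = []
    for start in range(len(signal) - k + 1):
        s = sum(signal[start + t] * kernel[t] for t in range(k))
        out.append(max(0.0, s))
    return out


# Toy PSSM with 3 residues and 2 columns (a real PSSM has 20 columns).
toy_pssm = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Concatenate both profiles into one feature vector for the classifier.
feature_vector = monogram(toy_pssm) + bigram(toy_pssm)

# Hypothetical filter weights; a trained CNN would learn these.
activations = conv1d_valid(feature_vector, [-1.0, 1.0])
```

For the toy matrix, the mono-gram part has 2 values and the bi-gram part has 4, so `feature_vector` has length 6; with a 20-column PSSM the same code would yield 420 features per protein regardless of sequence length, which is what makes a fixed-size classifier input possible.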