Department of Elementary and Secondary Education, Peshawar, Khyber Pakhtunkhwa, Pakistan.
Faculty of Computing and Information Technology, King Abdulaziz University, Rabigh, Jeddah 21911, Saudi Arabia.
Comput Intell Neurosci. 2022 Sep 28;2022:2987407. doi: 10.1155/2022/2987407. eCollection 2022.
DNA-binding proteins (DBPs) play crucial biological roles, including DNA replication, recombination, and transcription. DBPs are closely associated with chronic diseases and are used in the manufacture of antibiotics and steroids. A series of predictors has been developed to identify DBPs. However, efforts to further improve DBP identification are ongoing. This study presents a novel predictor, DBP-iDWT, that identifies DBPs more accurately. Features are extracted from the protein sequences using F-PSSM (filtered position-specific scoring matrix), PSSM-DPC (position-specific scoring matrix-dipeptide composition), and R-PSSM (reduced position-specific scoring matrix). To eliminate noisy attributes, we applied the DWT (discrete wavelet transform) to F-PSSM, PSSM-DPC, and R-PSSM, yielding three novel descriptors: F-PSSM-DWT, PSSM-DPC-DWT, and R-PSSM-DWT. Four models were then trained using LiXGB (light eXtreme gradient boosting), XGB (eXtreme gradient boosting), ERT (extremely randomized trees), and AdaBoost. LiXGB with R-PSSM-DWT attained 6.55% higher accuracy on the training dataset and 5.93% higher accuracy on the testing dataset than the best existing predictors. These results indicate that the proposed predictor outperforms those of past studies. DBP-iDWT could help in developing more effective therapeutic strategies for the treatment of fatal diseases.
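The descriptor pipeline summarized above (a PSSM-derived feature vector followed by a wavelet decomposition) can be illustrated with a minimal sketch. The abstract does not state the wavelet family, the number of decomposition levels, or how the coefficients are condensed, so the Haar wavelet, the three levels, and the four summary statistics below are assumptions made purely for illustration. The function names (`pssm_dpc`, `haar_dwt`, `summarize`, `dwt_descriptor`) are hypothetical, and the PSSM is random toy data, not a real profile.

```python
import math
import random


def pssm_dpc(pssm):
    """Dipeptide composition of a PSSM: an L x 20 matrix becomes a 400-vector."""
    length = len(pssm)
    width = len(pssm[0])
    features = []
    for a in range(width):
        for b in range(width):
            # Average product of column a at row i and column b at row i + 1.
            total = sum(pssm[i][a] * pssm[i + 1][b] for i in range(length - 1))
            features.append(total / (length - 1))
    return features


def haar_dwt(signal):
    """One level of the Haar DWT; returns (approximation, detail) coefficients."""
    if len(signal) % 2:  # pad odd-length input by repeating the last value
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def summarize(coeffs):
    """Max, min, mean, and standard deviation of a coefficient list."""
    mean = sum(coeffs) / len(coeffs)
    var = sum((c - mean) ** 2 for c in coeffs) / len(coeffs)
    return [max(coeffs), min(coeffs), mean, math.sqrt(var)]


def dwt_descriptor(features, levels=3):
    """Assumed scheme: summarize each detail band and the final approximation."""
    out = []
    approx = list(features)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        out.extend(summarize(detail))
    out.extend(summarize(approx))
    return out


# Toy stand-in for a real PSSM (random scores for a 50-residue sequence).
random.seed(0)
toy_pssm = [[random.uniform(-5, 5) for _ in range(20)] for _ in range(50)]
dpc = pssm_dpc(toy_pssm)
descriptor = dwt_descriptor(dpc, levels=3)
```

Under these assumptions, the 400 PSSM-DPC values are condensed into 16 values (four statistics for each of three detail bands plus the final approximation). A fixed-length vector of this kind could then be passed to a gradient-boosting or tree-ensemble classifier such as those named in the abstract.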