Single-stranded and double-stranded DNA-binding protein prediction using HMM profiles.

Affiliations

School of Electrical and Electronics Engineering, Fiji National University, Suva, Fiji.

Laboratory of Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, 230-0045, Japan; Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University (TMDU), Tokyo, 113-8510, Japan; Laboratory of Medical Science Mathematics, Department of Biological Sciences, Graduate School of Science, University of Tokyo, Tokyo, 113-0033, Japan.

Publication information

Anal Biochem. 2021 Jan 1;612:113954. doi: 10.1016/j.ab.2020.113954. Epub 2020 Sep 15.

Abstract

BACKGROUND

DNA-binding proteins play important roles in cellular processes and are involved in many biological activities. These proteins contain crucial protein-DNA binding domains and can interact with single-stranded or double-stranded DNA; accordingly, they are classified as single-stranded DNA-binding proteins (SSBs) or double-stranded DNA-binding proteins (DSBs). Computational prediction of SSBs and DSBs helps in annotating protein functions and understanding protein-binding domains.

RESULTS

Performance is reported on the DNA-binding protein dataset recently introduced by Wang et al. [1]. On the independent test set, the proposed method achieved a sensitivity of 0.600, specificity of 0.792, AUC of 0.758, MCC of 0.369, accuracy of 0.744, and F-measure of 0.536.
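The threshold-based metrics listed above follow from a binary confusion matrix using their standard definitions. The sketch below is illustrative only: the function name `binary_metrics` and the counts passed to it are hypothetical and are not the paper's data, and which class is treated as positive (SSB or DSB) is an assumption left to the caller. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes are present and
    at least one positive prediction was made).
    """
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)
    # F-measure (F1): harmonic mean of precision and sensitivity
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Hypothetical counts chosen for illustration; NOT taken from the paper.
metrics = binary_metrics(tp=30, fn=20, tn=120, fp=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC is reported alongside accuracy because it stays informative when the two classes are imbalanced, whereas accuracy alone can be dominated by the majority class.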

CONCLUSION

The proposed method, which uses hidden Markov model (HMM) profiles for feature extraction, outperformed the benchmark method in the literature and achieved an overall improvement of approximately 3%. The source code and supplementary information for the proposed method are available at https://github.com/roneshsharma/Predict-DNA-binding-proteins/wiki.
