Sequence-based Detection of DNA-binding Proteins using Multiple-view Features Allied with Feature Selection.

Affiliations

School of Internet of Things Engineering, Jiangnan University, Wuxi, China.

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.

Publication information

Mol Inform. 2020 Aug;39(8):e2000006. doi: 10.1002/minf.202000006. Epub 2020 Mar 23.

Abstract

DNA-binding proteins play essential roles in many molecular functions and in gene regulation. It is therefore highly desirable to develop effective computational techniques for detecting DNA-binding proteins. In this paper, we propose a new method, iDBP-DEP, which predicts DNA-binding proteins using discriminative features derived from multi-view feature sources, including evolutionary profile, dipeptide composition, and physicochemical properties, combined with feature selection. We evaluated iDBP-DEP on two benchmark datasets, PDB1075 and PDB594, using a rigorous jackknife test. Compared with state-of-the-art sequence-based DNA-binding predictors, iDBP-DEP achieved improvements of 1.8% in accuracy (Acc) and 3.0% in Matthews correlation coefficient (MCC) on PDB1075, and of 7.4% in Acc and 14.8% in MCC on PDB594. An independent validation test on PDB186 shows that the proposed method achieved the best performance in both Acc (80.1%) and MCC (0.684), which further demonstrates the robustness of iDBP-DEP for detecting DNA-binding proteins. The datasets and code used in this study are freely available at https://githup.com/Zll-codeside/iDBP-DEP.
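
Two of the quantities named in the abstract, the dipeptide composition feature view and the Acc/MCC evaluation metrics, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `dipeptide_composition` and `accuracy_and_mcc` are hypothetical, and the frequency normalization and the handling of non-standard residues are assumptions, since the abstract does not specify them.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids and all 400 ordered pairs (dipeptides).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide frequencies.

    Assumption: overlapping adjacent pairs are counted, and pairs that
    contain a non-standard residue (e.g. 'X') are skipped.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def accuracy_and_mcc(tp, tn, fp, fn):
    """Compute accuracy and Matthews correlation coefficient from a
    binary confusion matrix (true/false positives and negatives)."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc


if __name__ == "__main__":
    features = dipeptide_composition("ACAC")
    print(len(features), features[DIPEPTIDES.index("AC")])
    print(accuracy_and_mcc(tp=45, tn=45, fp=5, fn=5))
```

In a jackknife (leave-one-out) test, each protein would be held out once, the classifier trained on the rest, and the confusion-matrix counts accumulated over all held-out predictions before computing Acc and MCC as above.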
