Sequence-based Detection of DNA-binding Proteins using Multiple-view Features Allied with Feature Selection.

Affiliations

School of Internet of Things Engineering, Jiangnan University, Wuxi, China.

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.

Publication information

Mol Inform. 2020 Aug;39(8):e2000006. doi: 10.1002/minf.202000006. Epub 2020 Mar 23.

DOI: 10.1002/minf.202000006
PMID: 32144887
Abstract

DNA-binding proteins play essential roles in many molecular functions and gene regulation. Therefore, it is highly desirable to develop effective computational techniques for detecting DNA-binding proteins. In this paper, we propose a new method, iDBP-DEP, which predicts DNA-binding proteins using discriminative features derived from multi-view feature sources, including evolutionary profile, dipeptide composition, and physicochemical properties, combined with feature selection. We evaluated iDBP-DEP on two benchmark datasets, PDB1075 and PDB594, using a rigorous jackknife test. Compared with state-of-the-art sequence-based DNA-binding predictors, iDBP-DEP achieved improvements of 1.8% in accuracy (Acc) and 3.0% in Matthews correlation coefficient (MCC) on the PDB1075 dataset, and of 7.4% and 14.8% in Acc and MCC, respectively, on PDB594. An independent validation test on PDB186 shows that the proposed method achieved the best performance in Acc (80.1%) and MCC (0.684), which further demonstrates the robustness of iDBP-DEP for detecting DNA-binding proteins. Datasets and code used in this study are freely available at https://githup.com/Zll-codeside/iDBP-DEP.
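
To make two of the terms in the abstract concrete, the following is a minimal Python sketch of a dipeptide composition feature vector and of the Acc and MCC metrics computed from confusion-matrix counts. It is an illustration under stated assumptions, not the authors' iDBP-DEP implementation: the function names `dipeptide_composition` and `accuracy_and_mcc` are hypothetical, the 400-dimensional frequency encoding is the common textbook definition of dipeptide composition, and the evolutionary-profile features, physicochemical features, and feature selection step mentioned in the abstract are not covered.

```python
from itertools import product
from math import sqrt

# The 20 standard amino acids; 20 x 20 ordered pairs give 400 dipeptides.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
DIPEPTIDE_INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence):
    """Return a 400-dimensional list of dipeptide frequencies.

    Hypothetical helper: counts every adjacent residue pair in the
    sequence and normalises by the number of counted pairs. Pairs that
    contain a non-standard residue (e.g. 'X') are skipped.
    """
    sequence = sequence.upper()
    counts = [0] * len(DIPEPTIDES)
    total = 0
    for first, second in zip(sequence, sequence[1:]):
        index = DIPEPTIDE_INDEX.get(first + second)
        if index is not None:
            counts[index] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [count / total for count in counts]


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (Acc, MCC) from confusion-matrix counts.

    Acc = (TP + TN) / (TP + TN + FP + FN)
    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    MCC is reported as 0.0 when the denominator is zero.
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return acc, mcc


if __name__ == "__main__":
    features = dipeptide_composition("ACAC")
    print(len(features), features[DIPEPTIDE_INDEX["AC"]])
    print(accuracy_and_mcc(tp=5, tn=5, fp=5, fn=5))
```

In a jackknife (leave-one-out) test such as the one the abstract reports, each protein would be held out once, predicted by a model trained on the rest, and the resulting counts of true and false positives and negatives would be passed to a function like `accuracy_and_mcc`.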

Similar articles

1
Sequence-based Detection of DNA-binding Proteins using Multiple-view Features Allied with Feature Selection.
Mol Inform. 2020 Aug;39(8):e2000006. doi: 10.1002/minf.202000006. Epub 2020 Mar 23.
2
Sequence based prediction of DNA-binding proteins based on hybrid feature selection using random forest and Gaussian naïve Bayes.
PLoS One. 2014 Jan 24;9(1):e86703. doi: 10.1371/journal.pone.0086703. eCollection 2014.
3
Improved detection of DNA-binding proteins via compression technology on PSSM information.
PLoS One. 2017 Sep 29;12(9):e0185587. doi: 10.1371/journal.pone.0185587. eCollection 2017.
4
Identification of DNA-binding proteins by Kernel Sparse Representation via L-matrix norm.
Comput Biol Med. 2023 Jun;159:106849. doi: 10.1016/j.compbiomed.2023.106849. Epub 2023 Apr 11.
5
Identification of DNA-binding proteins using multi-features fusion and binary firefly optimization algorithm.
BMC Bioinformatics. 2016 Aug 26;17(1):323. doi: 10.1186/s12859-016-1201-8.
6
Identifying DNA-binding proteins based on multi-features and LASSO feature selection.
Biopolymers. 2021 Feb;112(2):e23419. doi: 10.1002/bip.23419. Epub 2021 Jan 21.
7
DP-BINDER: machine learning model for prediction of DNA-binding proteins by fusing evolutionary and physicochemical information.
J Comput Aided Mol Des. 2019 Jul;33(7):645-658. doi: 10.1007/s10822-019-00207-x. Epub 2019 May 23.
8
PSFM-DBT: Identifying DNA-Binding Proteins by Combing Position Specific Frequency Matrix and Distance-Bigram Transformation.
Int J Mol Sci. 2017 Aug 25;18(9):1856. doi: 10.3390/ijms18091856.
9
A Novel Computational Method for Detecting DNA Methylation Sites with DNA Sequence Information and Physicochemical Properties.
Int J Mol Sci. 2018 Feb 8;19(2):511. doi: 10.3390/ijms19020511.
10
Sequence-based prediction of DNA-binding residues in proteins with conservation and correlation information.
IEEE/ACM Trans Comput Biol Bioinform. 2012 Nov-Dec;9(6):1766-75. doi: 10.1109/TCBB.2012.106.

Cited by

1
Comparative Analysis on Alignment-Based and Pretrained Feature Representations for the Identification of DNA-Binding Proteins.
Comput Math Methods Med. 2022 Jun 28;2022:5847242. doi: 10.1155/2022/5847242. eCollection 2022.
2
iFeatureOmega: an integrative platform for engineering, visualization and analysis of features from molecular sequences, structural and ligand data sets.
Nucleic Acids Res. 2022 Jul 5;50(W1):W434-W447. doi: 10.1093/nar/gkac351.
3
Application of DNA-Binding Protein Prediction Based on Graph Convolutional Network and Contact Map.
Biomed Res Int. 2022 Jan 17;2022:9044793. doi: 10.1155/2022/9044793. eCollection 2022.
4
iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization.
Nucleic Acids Res. 2021 Jun 4;49(10):e60. doi: 10.1093/nar/gkab122.