Identification of DNA-binding proteins using multi-features fusion and binary firefly optimization algorithm.

Authors

Zhang Jian, Gao Bo, Chai Haiting, Ma Zhiqiang, Yang Guifu

Affiliations

School of Computer Science and Information Technology, Northeast Normal University, Changchun, 130117, People's Republic of China.

Office of Informatization Management and Planning, Northeast Normal University, Changchun, 130117, People's Republic of China.

Publication information

BMC Bioinformatics. 2016 Aug 26;17(1):323. doi: 10.1186/s12859-016-1201-8.

DOI: 10.1186/s12859-016-1201-8
PMID: 27565741
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5002159/
Abstract

BACKGROUND

DNA-binding proteins (DBPs) play fundamental roles in many biological processes. Developing effective computational tools for identifying DBPs has therefore become highly desirable.

RESULTS

In this study, we proposed an accurate method for the prediction of DBPs. First, we focused on the challenge of improving DBP prediction accuracy using information from the sequence alone. Second, we used multiple informative features to encode each protein: evolutionary conservation profiles, secondary structure motifs, and physicochemical properties. Third, we introduced a novel, improved Binary Firefly Algorithm (BFA) to remove redundant or noisy features and to select optimal parameters for the classifier. On two benchmark datasets, our predictor outperformed many state-of-the-art predictors, which demonstrated the effectiveness of our method. Its promising prediction performance on a newly compiled independent testing dataset from PDB and on a large-scale dataset from UniProt demonstrated the good generalization ability of our method. In addition, the BFA developed in this research has great potential for practical applications in optimization, especially in feature selection problems.

CONCLUSIONS

A highly accurate method was proposed for the identification of DBPs. A user-friendly web-server named iDbP (identification of DNA-binding Proteins) was constructed and provided for academic use.
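
To make the feature selection idea in the Results concrete, here is a minimal sketch of a generic binary firefly search over 0/1 feature masks. It is not the authors' improved BFA, whose update rules are not given in the abstract. The movement rule (copying differing bits from a brighter firefly with probability beta, plus random bit flips with probability alpha), the parameter defaults, the function name binary_firefly_select, and the toy fitness function with its INFORMATIVE index set are all assumptions chosen for illustration. In a real pipeline the fitness would be a cross-validated classifier score, and the paper additionally uses the BFA to select classifier parameters, which this sketch omits.

```python
import math
import random


def binary_firefly_select(fitness, n_features, n_fireflies=10, n_iters=30,
                          beta0=1.0, gamma=1.0, alpha=0.05, seed=0):
    """Generic binary firefly search over 0/1 feature masks (maximizes fitness).

    Illustrative only: the update rule below is one common discrete variant,
    not the improved BFA described in the paper.
    """
    rng = random.Random(seed)
    swarm = [[rng.randint(0, 1) for _ in range(n_features)]
             for _ in range(n_fireflies)]
    light = [fitness(mask) for mask in swarm]  # brightness of each firefly
    best = max(range(n_fireflies), key=lambda k: light[k])
    best_mask, best_fit = swarm[best][:], light[best]

    for _ in range(n_iters):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] <= light[i]:
                    continue  # firefly i only moves toward brighter fireflies
                # Attractiveness decays with normalized Hamming distance.
                r = sum(a != b for a, b in zip(swarm[i], swarm[j])) / n_features
                beta = beta0 * math.exp(-gamma * r * r)
                for d in range(n_features):
                    # Copy a differing bit from the brighter firefly.
                    if swarm[i][d] != swarm[j][d] and rng.random() < beta:
                        swarm[i][d] = swarm[j][d]
                    # Random bit flip keeps the search exploring.
                    if rng.random() < alpha:
                        swarm[i][d] ^= 1
                light[i] = fitness(swarm[i])
                if light[i] > best_fit:
                    best_mask, best_fit = swarm[i][:], light[i]
    return best_mask, best_fit


# Hypothetical toy problem: features 0, 3 and 5 are "informative";
# every selected feature costs 0.1, so noisy features lower the score.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(mask[d] for d in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask, best_fit = binary_firefly_select(toy_fitness, n_features=8)
print(best_mask, best_fit)
```

The search tracks the best mask seen so far, so the returned fitness never falls below the best value in the initial random swarm. Fixing the seed makes a run reproducible, which helps when comparing parameter settings.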


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/290f/5002159/194fd10cb218/12859_2016_1201_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/290f/5002159/0d4c4e3a0724/12859_2016_1201_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/290f/5002159/b4ff955d2b55/12859_2016_1201_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/290f/5002159/9ac46887ac1e/12859_2016_1201_Fig3_HTML.jpg


