FKRR-MVSF: A Fuzzy Kernel Ridge Regression Model for Identifying DNA-Binding Proteins by Multi-View Sequence Features via Chou's Five-Step Rule.

Affiliations

School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China.

Engineering Research Center of Internet of Things Applied Technology, Ministry of Education, Wuxi 214122, China.

Publication information

Int J Mol Sci. 2019 Aug 26;20(17):4175. doi: 10.3390/ijms20174175.

DOI:10.3390/ijms20174175
PMID:31454964
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6747228/
Abstract

DNA-binding proteins play an important role in cell metabolism. In the laboratory, DNA-binding proteins are detected with yeast one-hybrid methods, bacterial one-hybrid methods, X-ray crystallography, and other techniques, but these methods demand considerable labor, materials, and time. In recent years, many computational approaches have been proposed to detect DNA-binding proteins. This paper proposes a machine learning method, the Fuzzy Kernel Ridge Regression model based on Multi-View Sequence Features (FKRR-MVSF), to identify DNA-binding proteins. First, multi-view sequence features are extracted from protein sequences. Next, a Multiple Kernel Learning (MKL) algorithm combines the features. Finally, a Fuzzy Kernel Ridge Regression (FKRR) model is built to detect DNA-binding proteins. Compared with other methods, our model achieves good results: it obtains accuracies of 83.26% and 81.72% on two benchmark datasets, PDB1075 and PDB186, respectively.
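The three steps in the abstract (per-view features, kernel combination, fuzzy kernel ridge regression) can be illustrated with a minimal sketch. The abstract does not specify the feature extractors, the MKL algorithm, or the fuzzy membership function, so everything concrete here is an assumption chosen for illustration: synthetic two-view data stands in for sequence features, RBF kernels are used for each view, kernel-target alignment stands in for MKL, and memberships shrink with distance from the class centroid in kernel space. It is not the authors' implementation.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))


def alignment_weights(kernels, y):
    """Assumed stand-in for MKL: weight each view by kernel-target alignment."""
    target = np.outer(y, y)
    scores = np.array([
        np.sum(K * target) / (np.linalg.norm(K) * np.linalg.norm(target))
        for K in kernels
    ])
    scores = np.clip(scores, 1e-12, None)
    return scores / scores.sum()


def fuzzy_memberships(K, y, floor=0.1):
    """Hypothetical rule: samples far from their class centroid get lower weight."""
    s = np.empty(len(y))
    for label in np.unique(y):
        idx = np.where(y == label)[0]
        Kc = K[np.ix_(idx, idx)]
        # Squared distance of each sample to its class mean in kernel space.
        d2 = np.maximum(np.diag(Kc) - 2.0 * Kc.mean(axis=1) + Kc.mean(), 0.0)
        s[idx] = 1.0 - (1.0 - floor) * d2 / (d2.max() + 1e-12)
    return s


def fit_weighted_krr(K, y, s, lam=1.0):
    """Minimise sum_i s_i (y_i - (K a)_i)^2 + lam * a^T K a.

    Setting the gradient to zero gives (S K + lam I) a = S y, S = diag(s).
    """
    return np.linalg.solve(np.diag(s) @ K + lam * np.eye(len(y)), s * y)


# Synthetic two-view data standing in for multi-view sequence features.
rng = np.random.default_rng(0)
n_train, n_test = 60, 40
y_all = np.where(np.arange(n_train + n_test) % 2 == 0, 1.0, -1.0)
view1 = 2.0 * y_all[:, None] + rng.normal(size=(n_train + n_test, 5))
view2 = 2.0 * y_all[:, None] + rng.normal(size=(n_train + n_test, 3))
views, gammas = [view1, view2], [0.05, 0.05]
y_train, y_test = y_all[:n_train], y_all[n_train:]

# Step 1: one kernel per view. Step 2: combine them. Step 3: fuzzy KRR.
K_views = [rbf_kernel(V[:n_train], V[:n_train], g) for V, g in zip(views, gammas)]
mu = alignment_weights(K_views, y_train)
K_train = sum(w * K for w, K in zip(mu, K_views))
s = fuzzy_memberships(K_train, y_train)
alpha = fit_weighted_krr(K_train, y_train, s, lam=1.0)

# Predict held-out samples with the same view weights.
K_test = sum(w * rbf_kernel(V[n_train:], V[:n_train], g)
             for w, V, g in zip(mu, views, gammas))
y_pred = np.sign(K_test @ alpha)
accuracy = float(np.mean(y_pred == y_test))
print(f"view weights: {mu}, held-out accuracy: {accuracy:.2f}")
```

The membership vector `s` only reweights the squared-error term, so setting every entry to 1 recovers ordinary kernel ridge regression on the combined kernel. The accuracy printed here reflects the easy synthetic data and says nothing about the PDB1075 or PDB186 results reported in the abstract.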


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/886e/6747228/905aea77f1b0/ijms-20-04175-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/886e/6747228/009b43ace197/ijms-20-04175-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/886e/6747228/00ae6bd3f703/ijms-20-04175-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/886e/6747228/1f6aad70059a/ijms-20-04175-g004.jpg


