A method for identifying moonlighting proteins based on linear discriminant analysis and bagging-SVM.

Authors

Chen Yu, Li Sai, Guo Jifeng

Affiliation

College of Information and Computer Engineering, Northeast Forestry University, Harbin, China.

Publication

Front Genet. 2022 Aug 15;13:963349. doi: 10.3389/fgene.2022.963349. eCollection 2022.

DOI:10.3389/fgene.2022.963349
PMID:36046247
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9420859/
Abstract

Moonlighting proteins have at least two independent functions and are widely found in animals, plants, and microorganisms. They play important roles in signal transduction, cell growth and movement, tumor inhibition, DNA synthesis and repair, and the metabolism of biological macromolecules. Because moonlighting proteins are difficult to find through biological experiments, many researchers identify them with bioinformatics methods, but the accuracy of those methods is relatively low. We therefore propose a new method. In this study, we select SVMProt-188D as the feature input, combine linear discriminant analysis with basic machine learning classifiers, and then build a bagging ensemble on the best-performing classifier, the support vector machine. This approach identifies moonlighting proteins accurately and efficiently. The model achieves an accuracy of 93.26% and an F-score of 0.946 on the MPFit dataset, which is better than the existing MEL-MP model. It also achieves good results on two other moonlighting protein datasets.
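The pipeline described in the abstract (188-dimensional features, linear discriminant analysis, then a bagging ensemble of support vector machines, evaluated by accuracy and F-score) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for SVMProt-188D vectors, and the standardization step, RBF kernel, number of bagged estimators, and train/test split are all choices made here, since the abstract does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a 188-dimensional feature matrix.
# Assumption: these are NOT real SVMProt-188D features or MPFit labels.
X, y = make_classification(
    n_samples=400, n_features=188, n_informative=20, random_state=0
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize, project with LDA (one discriminant axis for two classes),
# then classify with a bagging ensemble of SVMs.
# Kernel and n_estimators are illustrative assumptions.
model = make_pipeline(
    StandardScaler(),
    LinearDiscriminantAnalysis(),
    BaggingClassifier(SVC(kernel="rbf"), n_estimators=10, random_state=0),
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
acc = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
print(f"accuracy={acc:.3f}  F-score={f1:.3f}")
```

On synthetic data the printed scores say nothing about the results reported in the paper; the sketch only shows how the three stages fit together. Placing LDA inside the pipeline means the projection is fit on the training split only, which avoids leaking test labels into the dimensionality reduction.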


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff55/9420859/630a7d3aa4b9/fgene-13-963349-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff55/9420859/a96a4a16b278/fgene-13-963349-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff55/9420859/b2efc84bacd4/fgene-13-963349-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff55/9420859/e183984d6ae6/fgene-13-963349-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff55/9420859/1e656cfa9a64/fgene-13-963349-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff55/9420859/385c964dec1f/fgene-13-963349-g006.jpg


