Moonlighting protein prediction using physico-chemical and evolutional properties via machine learning methods.

Affiliations

Laboratory of Bioinformatics and Drug Design, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran.

Bioinformatics and Computational Omics Lab (BioCOOL), Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran.

Publication information

BMC Bioinformatics. 2021 May 24;22(1):261. doi: 10.1186/s12859-021-04194-5.

DOI: 10.1186/s12859-021-04194-5
PMID: 34030624
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8142502/
Abstract

BACKGROUND

Moonlighting proteins (MPs) are a subclass of multifunctional proteins in which a single polypeptide chain performs more than one independent, and usually distinct, function. The main reasons to study MPs are to identify unknown cellular processes, understand novel protein mechanisms, improve the prediction of protein functions, and gain information about protein evolution. MPs also play an important role in disease pathways and drug-target discovery. Because detecting MPs experimentally is challenging, most known MPs were discovered by chance. A suitable computational approach to predicting MPs is therefore a reasonable goal.

RESULTS

In this study, we introduce a model that distinguishes moonlighting proteins from non-MPs using features extracted from protein sequences, together with a scheme for detecting outlier proteins. We used 37 distinct feature vectors to study each protein's impact on detecting MPs, and we assessed 8 classification methods to find the best performer. To detect outliers, each classifier was run 100 times with tenfold cross-validation on each feature vector, and the proteins that were misclassified 90 times or more were grouped. This process was applied to every feature vector, and the intersection of the resulting groups was taken as the set of outlier proteins. Tenfold cross-validation on a dataset of 351 samples (215 moonlighting and 136 non-moonlighting proteins) shows that the SVM method on all feature vectors outperforms both the other methods in this study and other available methods. The outlier analysis further showed that 57 of the 351 proteins in the dataset are candidate outliers. Among the outlier proteins were non-MPs (such as P69797) that were misclassified by 8 different classification methods with 16 different feature vectors. Because these non-MPs were obtained by computational methods, the results of this study call into question whether they are non-moonlighting at all.
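The outlier-detection scheme described above (repeated cross-validation, a misclassification threshold, then an intersection across feature vectors) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the nearest-centroid classifier is a toy stand-in for the paper's classifiers such as SVM, the folds are plain shuffled folds (the abstract does not say whether they were stratified), and only a single classifier is shown because the abstract does not spell out how results from the 8 classifiers were combined.

```python
import random
from collections import Counter


def nearest_centroid_predict(train_X, train_y, test_X):
    """Toy stand-in classifier (assumption): label each test point by the closest class mean."""
    sums, counts = {}, Counter(train_y)
    for x, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
    centroids = {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda c: dist2(x, centroids[c])) for x in test_X]


def misclassification_counts(X, y, predict, n_repeats=100, n_folds=10, seed=0):
    """For each sample, count in how many repeated k-fold runs it was misclassified."""
    rng = random.Random(seed)
    n = len(y)
    wrong = [0] * n
    for _ in range(n_repeats):
        order = list(range(n))
        rng.shuffle(order)
        for fold in range(n_folds):
            test_idx = order[fold::n_folds]  # every sample is tested once per repeat
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            preds = predict(
                [X[i] for i in train_idx],
                [y[i] for i in train_idx],
                [X[i] for i in test_idx],
            )
            for i, pred in zip(test_idx, preds):
                if pred != y[i]:
                    wrong[i] += 1
    return wrong


def find_outliers(feature_sets, y, predict, threshold=90, **cv_kwargs):
    """Intersect, across feature vectors, the samples misclassified >= threshold times."""
    groups = []
    for X in feature_sets:
        wrong = misclassification_counts(X, y, predict, **cv_kwargs)
        groups.append({i for i, w in enumerate(wrong) if w >= threshold})
    return set.intersection(*groups) if groups else set()
```

Here `feature_sets` is a list with one feature matrix per feature vector type, each holding one row per protein in the same order as the labels `y`. With the defaults (100 repeats, tenfold cross-validation, threshold 90) the call mirrors the numbers quoted in the abstract; smaller values are convenient for quick experiments.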

CONCLUSIONS

MPs are difficult to identify experimentally. Using distinct feature vectors, our method enabled the identification of novel moonlighting proteins. The study also indicated that a number of proteins labeled as non-MPs are likely to be moonlighting.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2fb3/8142502/1a62cafd7c13/12859_2021_4194_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2fb3/8142502/5241b5615047/12859_2021_4194_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2fb3/8142502/b381ff9f09fa/12859_2021_4194_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2fb3/8142502/77a56d80aaf0/12859_2021_4194_Fig4_HTML.jpg

Similar articles

1. Moonlighting protein prediction using physico-chemical and evolutional properties via machine learning methods.
   BMC Bioinformatics. 2021 May 24;22(1):261. doi: 10.1186/s12859-021-04194-5.
2. Genome-scale prediction of moonlighting proteins using diverse protein association information.
   Bioinformatics. 2016 Aug 1;32(15):2281-8. doi: 10.1093/bioinformatics/btw166. Epub 2016 Mar 26.
3. Prediction of Moonlighting Proteins Using Multimodal Deep Ensemble Learning.
   Front Genet. 2021 Mar 22;12:630379. doi: 10.3389/fgene.2021.630379. eCollection 2021.
4. DextMP: deep dive into text for predicting moonlighting proteins.
   Bioinformatics. 2017 Jul 15;33(14):i83-i91. doi: 10.1093/bioinformatics/btx231.
5. DP-BINDER: machine learning model for prediction of DNA-binding proteins by fusing evolutionary and physicochemical information.
   J Comput Aided Mol Des. 2019 Jul;33(7):645-658. doi: 10.1007/s10822-019-00207-x. Epub 2019 May 23.
6. A method for identifying moonlighting proteins based on linear discriminant analysis and bagging-SVM.
   Front Genet. 2022 Aug 15;13:963349. doi: 10.3389/fgene.2022.963349. eCollection 2022.
7. IdentPMP: identification of moonlighting proteins in plants using sequence-based learning models.
   PeerJ. 2021 Aug 6;9:e11900. doi: 10.7717/peerj.11900. eCollection 2021.
8. Enzyme classification using multiclass support vector machine and feature subset selection.
   Comput Biol Chem. 2017 Oct;70:211-219. doi: 10.1016/j.compbiolchem.2017.08.009. Epub 2017 Aug 31.
9. Prediction of oxidoreductase subfamily classes based on RFE-SND-CC-PSSM and machine learning methods.
   J Bioinform Comput Biol. 2019 Aug;17(4):1950029. doi: 10.1142/S021972001950029X.
10. Early Detection of Hypotension Using a Multivariate Machine Learning Approach.
    Mil Med. 2021 Jan 25;186(Suppl 1):440-444. doi: 10.1093/milmed/usaa323.

Cited by

1. Moonlighting enzymes: when cellular context defines specificity.
   Cell Mol Life Sci. 2023 Apr 24;80(5):130. doi: 10.1007/s00018-023-04781-0.
2. Computational method for aromatase-related proteins using machine learning approach.
   PLoS One. 2023 Mar 29;18(3):e0283567. doi: 10.1371/journal.pone.0283567. eCollection 2023.
3. Predictive modeling of moonlighting DNA-binding proteins.
   NAR Genom Bioinform. 2022 Dec 2;4(4):lqac091. doi: 10.1093/nargab/lqac091. eCollection 2022 Dec.
4. A method for identifying moonlighting proteins based on linear discriminant analysis and bagging-SVM.
   Front Genet. 2022 Aug 15;13:963349. doi: 10.3389/fgene.2022.963349. eCollection 2022.
5. The Minimal Translation Machinery: What We Can Learn From Naturally and Experimentally Reduced Genomes.
   Front Microbiol. 2022 Apr 11;13:858983. doi: 10.3389/fmicb.2022.858983. eCollection 2022.
6. Moonlighting in Rickettsiales: Expanding Virulence Landscape.
   Trop Med Infect Dis. 2022 Feb 19;7(2):32. doi: 10.3390/tropicalmed7020032.
7. Correction to: Moonlighting protein prediction using physico-chemical and evolutional properties via machine learning methods.
   BMC Bioinformatics. 2021 Jul 9;22(1):366. doi: 10.1186/s12859-021-04257-7.