Genome-scale prediction of moonlighting proteins using diverse protein association information.

Authors

Khan Ishita K, Kihara Daisuke

Affiliations

Department of Computer Science.

Department of Computer Science and Department of Biological Science, Purdue University, West Lafayette, IN, USA.

Publication information

Bioinformatics. 2016 Aug 1;32(15):2281-8. doi: 10.1093/bioinformatics/btw166. Epub 2016 Mar 26.

DOI:10.1093/bioinformatics/btw166
PMID:27153604
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4965633/
Abstract

MOTIVATION

Moonlighting proteins (MPs) show multiple cellular functions within a single polypeptide chain. To understand the overall landscape of their functional diversity, it is important to establish a computational method that can identify MPs on a genome scale. Previously, we have systematically characterized MPs using functional and omics-scale information. In this work, we develop a computational prediction model for automatic identification of MPs using a diverse range of protein association information.

RESULTS

We incorporated a diverse range of protein association information to extract characteristic features of MPs, which range from gene ontology (GO), protein-protein interactions, gene expression, phylogenetic profiles, genetic interactions and network-based graph properties to protein structural properties, i.e. intrinsically disordered regions in the protein chain. Then, we used machine learning classifiers using the broad feature space for predicting MPs. Because many known MPs lack some proteomic features, we developed an imputation technique to fill such missing features. Results on the control dataset show that MPs can be predicted with over 98% accuracy when GO terms are available. Furthermore, using only the omics-based features the method can still identify MPs with over 75% accuracy. Last, we applied the method on three genomes: Saccharomyces cerevisiae, Caenorhabditis elegans and Homo sapiens, and found that about 2-10% of proteins in the genomes are potential MPs.
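
The abstract does not say which imputation technique or which classifiers were used, so the following is only a minimal sketch of the general idea: fill missing feature values, then classify on the completed feature vectors. Column-mean imputation and a nearest-centroid classifier are stand-ins chosen for brevity, not the authors' method. The feature names (interaction degree, co-expression score, disordered fraction) and all numeric values are hypothetical toy data.

```python
from statistics import mean
from typing import Optional

Row = list[Optional[float]]  # None marks a missing feature value


def column_means(rows: list[Row]) -> list[float]:
    """Mean of the observed values in each column (0.0 if a column is empty)."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(mean(observed) if observed else 0.0)
    return means


def fill(row: Row, means: list[float]) -> list[float]:
    """Replace each missing value with the training-set column mean."""
    return [means[j] if v is None else v for j, v in enumerate(row)]


def fit_centroids(rows: list[Row], labels: list[str], means: list[float]) -> dict[str, list[float]]:
    """Per-class centroid of the imputed feature vectors."""
    centroids = {}
    for label in set(labels):
        members = [fill(r, means) for r, l in zip(rows, labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids


def predict(row: Row, centroids: dict[str, list[float]], means: list[float]) -> str:
    """Assign the class whose centroid is closest in squared Euclidean distance.

    Features are left unscaled here to keep the sketch short; a real pipeline
    would normalise them so that no single feature dominates the distance.
    """
    x = fill(row, means)
    return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


# Hypothetical toy features: [interaction degree, co-expression score, disordered fraction]
train_rows: list[Row] = [
    [12.0, 0.2, 0.40],
    [10.0, None, 0.35],   # co-expression score missing
    [3.0, 0.7, 0.10],
    [None, 0.8, 0.05],    # interaction degree missing
]
train_labels = ["MP", "MP", "non-MP", "non-MP"]

means = column_means(train_rows)
centroids = fit_centroids(train_rows, train_labels, means)

print(predict([11.0, None, 0.30], centroids, means))
print(predict([2.0, 0.9, None], centroids, means))
```

Queries with missing values are imputed with the same training-set means before classification, so a protein lacking some omics features can still receive a prediction.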

AVAILABILITY AND IMPLEMENTATION

Code available at http://kiharalab.org/MPprediction

CONTACT

dkihara@purdue.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
