Machine Learning of Protein Interactions in Fungal Secretory Pathways.

Authors

Kludas Jana, Arvas Mikko, Castillo Sandra, Pakula Tiina, Oja Merja, Brouard Céline, Jäntti Jussi, Penttilä Merja, Rousu Juho

Affiliations

Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland.

VTT Technical Research Centre of Finland, Espoo, Finland.

Publication information

PLoS One. 2016 Jul 21;11(7):e0159302. doi: 10.1371/journal.pone.0159302. eCollection 2016.

DOI:10.1371/journal.pone.0159302
PMID:27441920
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4956264/
Abstract

In this paper we apply machine learning methods to predict protein interactions in fungal secretion pathways. We assume an inter-species transfer setting, in which training data is obtained from a single species and the objective is to predict protein interactions in other, related species. Our methodology combines several state-of-the-art machine learning approaches, namely multiple kernel learning (MKL), pairwise kernels, and kernelized structured output prediction, within the supervised graph inference framework. For MKL, we apply the recently proposed centered kernel alignment and p-norm path following approaches to integrate several feature sets describing the proteins, and both improve performance over a uniform combination of the kernels. For graph inference, we apply input-output kernel regression (IOKR) in supervised and semi-supervised modes, as well as output kernel trees (OK3). In experiments simulating increasing genetic distance, IOKR proved to be the most robust prediction approach. We evaluate the methods on the task of predicting protein-protein interactions in fungal secretion pathways, with S. cerevisiae (baker's yeast) as the source species and T. reesei as the target species of the inter-species transfer learning. We identify completely novel candidate secretion proteins that are conserved in filamentous fungi. These proteins could contribute to the unique secretion capabilities of filamentous fungi.
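
As a rough illustration of two ingredients named in the abstract, the Python sketch below weights two toy protein kernels by their centered kernel alignment with an output kernel, then scores protein pairs with a supervised IOKR-style ridge estimator. Everything concrete in it is an assumption made for illustration and not the authors' implementation: the toy features, the diffusion kernel used as output kernel, the simple per-kernel alignment heuristic (used here in place of the p-norm path following method), the regularization value `lam`, and all function names.

```python
import numpy as np

def center_kernel(K):
    """Center a kernel matrix in feature space: Kc = H K H, H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def centered_alignment(K1, K2):
    """Centered kernel alignment: <K1c, K2c>_F / (||K1c||_F * ||K2c||_F)."""
    K1c, K2c = center_kernel(K1), center_kernel(K2)
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

def align_weights(kernels, K_target):
    """Weight each kernel by its own alignment with the target kernel.
    Assumption: negative alignments are clipped to zero, then normalized."""
    a = np.array([max(centered_alignment(K, K_target), 0.0) for K in kernels])
    return a / a.sum()

def diffusion_kernel(A, beta=1.0):
    """Diffusion kernel exp(-beta * L) of a graph with adjacency matrix A."""
    L = np.diag(A.sum(axis=1)) - A
    w, V = np.linalg.eigh(L)              # L is symmetric
    return (V * np.exp(-beta * w)) @ V.T  # V diag(exp(-beta w)) V^T

def iokr_scores(K_train, K_out, K_test_train, lam=1.0):
    """Supervised IOKR-style scores S[i, j] = k_i^T B K_out B k_j,
    with B = (K_train + lam * I)^-1 and k_i the kernel row of test protein i."""
    n = K_train.shape[0]
    # K_train + lam * I is symmetric, so solving against the transpose is valid.
    M = np.linalg.solve(K_train + lam * np.eye(n), K_test_train.T).T
    return M @ K_out @ M.T

# Toy source species: 8 proteins forming two fully connected interaction modules.
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
A_train = (labels[:, None] == labels[None, :]).astype(float)
np.fill_diagonal(A_train, 0.0)
K_out = diffusion_kernel(A_train, beta=1.0)

# Two hypothetical feature sets: one tracks the modules, one is unrelated to them.
X_info = np.eye(2)[labels]
X_noise = np.tile([1.0, -1.0], 4)[:, None]
kernels = [X_info @ X_info.T, X_noise @ X_noise.T]

weights = align_weights(kernels, K_out)
K_train = sum(w * K for w, K in zip(weights, kernels))

# Toy target species: 4 unseen proteins described by the same two feature sets.
labels_test = np.array([0, 0, 1, 1])
X_info_test = np.eye(2)[labels_test]
X_noise_test = np.array([[1.0], [-1.0], [1.0], [-1.0]])
K_test_train = (weights[0] * X_info_test @ X_info.T
                + weights[1] * X_noise_test @ X_noise.T)

S = iokr_scores(K_train, K_out, K_test_train, lam=1.0)
print("kernel weights:", weights)
print("pair scores:\n", S)
```

In this sketch a higher entry `S[i, j]` marks a more plausible interaction between target proteins `i` and `j`; thresholding `S` would yield a predicted interaction graph. Because of how the toy data is built, the module-tracking kernel gets most of the weight and same-module pairs score higher than cross-module pairs.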

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb9c/4956264/e5435ceec272/pone.0159302.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb9c/4956264/8a72d55ec45d/pone.0159302.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb9c/4956264/730ce5a49dd2/pone.0159302.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb9c/4956264/f2dc5c542ec3/pone.0159302.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb9c/4956264/22b5e4233fdc/pone.0159302.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb9c/4956264/c0881c975e8f/pone.0159302.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb9c/4956264/c383ae43d764/pone.0159302.g007.jpg
