Predicting protein sumoylation sites from sequence features.

Affiliation

Department of Genetics and Biochemistry, Clemson University, Clemson, SC 29634, USA.

Publication information

Amino Acids. 2012 Jul;43(1):447-55. doi: 10.1007/s00726-011-1100-2. Epub 2011 Oct 7.

Abstract

Protein sumoylation is a post-translational modification that plays an important role in a wide range of cellular processes. Small ubiquitin-related modifier (SUMO) can be covalently and reversibly conjugated to the sumoylation sites of target proteins, many of which are implicated in various human genetic disorders. The accurate prediction of protein sumoylation sites may help biomedical researchers to design their experiments and understand the molecular mechanism of protein sumoylation. In this study, a new machine learning approach has been developed for predicting sumoylation sites from protein sequence information. Random forests (RFs) and support vector machines (SVMs) were trained with the data collected from the literature. Domain-specific knowledge in terms of relevant biological features was used for input vector encoding. It was shown that RF classifier performance was affected by the sequence context of sumoylation sites, and 20 residues with the core motif ΨKXE in the middle appeared to provide enough context information for sumoylation site prediction. The RF classifiers were also found to outperform SVM models for predicting protein sumoylation sites from sequence features. The results suggest that the machine learning approach gives rise to more accurate prediction of protein sumoylation sites than the other existing methods. The accurate classifiers have been used to develop a new web server, called seeSUMO (http://bioinfo.ggc.org/seesumo/), for sequence-based prediction of protein sumoylation sites.
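To make the sequence-context idea concrete, the following is a minimal sketch of how candidate sites might be located and cut into fixed-length windows before any feature encoding. It is not the authors' implementation. Ψ is conventionally read as a large hydrophobic residue; the specific residue set `ILVMF`, the 8-residue flank on each side (8 + 4 + 8 = 20), the `-` padding character, and the function name `candidate_windows` are all assumptions made for illustration, since the abstract does not specify them.

```python
import re

# Assumption: Ψ (large hydrophobic residue) is taken as one of I, L, V, M, F.
PSI = "ILVMF"
# A lookahead is used so that overlapping motif matches are all reported.
MOTIF = re.compile(rf"(?=([{PSI}]K[A-Z]E))")

FLANK = 8   # assumption: 8 residues on each side of the 4-residue motif
PAD = "-"   # assumption: placeholder for positions beyond the termini


def candidate_windows(sequence, flank=FLANK):
    """Yield (lysine_position, window) for every ΨKXE match.

    lysine_position is 1-based; window has length 2 * flank + 4
    with the motif in the middle.
    """
    seq = sequence.upper()
    padded = PAD * flank + seq + PAD * flank
    for match in MOTIF.finditer(seq):
        start = match.start()  # 0-based index of Ψ in the unpadded sequence
        # In `padded`, Ψ sits at start + flank, so the window begins at `start`.
        window = padded[start:start + 2 * flank + 4]
        yield start + 2, window  # the lysine follows Ψ; +1 more for 1-based


if __name__ == "__main__":
    demo = "A" * 8 + "LKSE" + "A" * 8
    for position, window in candidate_windows(demo):
        print(position, window)
```

Each window produced this way could then be encoded as a feature vector and passed to a classifier such as a random forest or an SVM. Sites near a terminus are padded rather than dropped, which is one possible design choice among several.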

