The predictive performance of short-linear motif features in the prediction of calmodulin-binding proteins.

Author information

School of Computer Science, University of Windsor, Windsor, Ontario, Canada.

Institute of Environmental Health Sciences, Wayne State University, Detroit, MI, USA.

Publication information

BMC Bioinformatics. 2018 Nov 20;19(Suppl 14):410. doi: 10.1186/s12859-018-2378-9.

Abstract

BACKGROUND

Predicting calmodulin-binding (CaM-binding) proteins is important in biology and biochemistry because calmodulin binds and regulates a multitude of protein targets that affect different cellular processes. Computational methods that can accurately identify CaM-binding proteins and CaM-binding domains would accelerate research in calcium signaling and calmodulin function. Short-linear motifs (SLiMs) have been used effectively as features for analyzing protein-protein interactions, but their properties have not yet been utilized in the prediction of CaM-binding proteins.

RESULTS

We propose a new method for predicting CaM-binding proteins that uses, as features for the prediction module, the total and average scores of known and new SLiMs in protein sequences, computed with a new scoring method called sliding window scoring (SWS). A dataset of 194 manually curated human CaM-binding proteins and 193 mitochondrial proteins was obtained and used to test the proposed model. The motif generation tool Multiple EM for Motif Elucidation (MEME) was used to obtain new motifs from the positive and negative datasets individually (the SM approach) and from the combined negative and positive datasets (the CM approach). Feature selection (FS) was then applied using the wrapper criterion with random forest, followed by classification with several algorithms: k-nearest neighbors (k-NN), support vector machines (SVM), naive Bayes (NB) and random forest (RF).
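The abstract names sliding window scoring but does not define it, so the following is only a minimal sketch of how total and average motif scores could be turned into features. It assumes each motif is a position-specific scoring matrix (the kind of matrix MEME can report), that a window's score is the sum of its per-position weights, and that the total and average are taken over all windows of motif width. The names `window_score`, `sliding_window_scores`, `feature_vector`, and the toy matrix values are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical position-specific scoring matrix: one dict per motif position,
# mapping an amino-acid letter to a weight. All values here are made up.
Pssm = List[Dict[str, float]]


def window_score(window: str, pssm: Pssm) -> float:
    """Sum the per-position weights of one window (unlisted residues score 0)."""
    return sum(column.get(residue, 0.0) for residue, column in zip(window, pssm))


def sliding_window_scores(sequence: str, pssm: Pssm) -> Tuple[float, float]:
    """Return (total, average) motif score over all windows of motif width.

    Assumption: total is the sum of every window score, and average is that
    total divided by the number of windows. The paper may define SWS differently.
    """
    width = len(pssm)
    n_windows = len(sequence) - width + 1
    if width == 0 or n_windows <= 0:
        return 0.0, 0.0
    total = sum(window_score(sequence[i:i + width], pssm) for i in range(n_windows))
    return total, total / n_windows


def feature_vector(sequence: str, motifs: Dict[str, Pssm]) -> List[float]:
    """Concatenate (total, average) for each motif, in sorted motif-name order."""
    features: List[float] = []
    for name in sorted(motifs):
        total, average = sliding_window_scores(sequence, motifs[name])
        features.extend([total, average])
    return features


if __name__ == "__main__":
    toy_motif: Pssm = [{"K": 1.0, "R": 1.0}, {"L": 2.0}, {"W": 3.0}]
    # Windows of "AKLWA": "AKL" -> 0, "KLW" -> 6, "LWA" -> 0
    print(sliding_window_scores("AKLWA", toy_motif))  # prints (6.0, 2.0)
```

Under these assumptions, each protein yields two numbers per motif, and the resulting vectors are what a feature-selection step and a classifier such as k-NN, SVM, NB, or RF would consume.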

CONCLUSIONS

Our proposed method shows very good prediction results and demonstrates that the information contained in SLiMs is highly relevant to predicting CaM-binding proteins. In addition, three new CaM-binding motifs were computationally selected and biologically validated in this study; they can be used for predicting CaM-binding proteins.
