The predictive performance of short-linear motif features in the prediction of calmodulin-binding proteins.

Affiliations

School of Computer Science, University of Windsor, Windsor, Ontario, Canada.

Institute of Environmental Health Sciences, Wayne State University, Detroit, MI, USA.

Publication information

BMC Bioinformatics. 2018 Nov 20;19(Suppl 14):410. doi: 10.1186/s12859-018-2378-9.

DOI:10.1186/s12859-018-2378-9

PMID:30453876

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6245490/

Abstract

BACKGROUND

The prediction of calmodulin-binding (CaM-binding) proteins is important in biology and biochemistry because calmodulin binds and regulates a multitude of protein targets that affect different cellular processes. Computational methods that can accurately identify CaM-binding proteins and CaM-binding domains would accelerate research in calcium signaling and calmodulin function. Short-linear motifs (SLiMs) have been used effectively as features for analyzing protein-protein interactions, but their properties have not been utilized in the prediction of CaM-binding proteins.

RESULTS

We propose a new method for the prediction of CaM-binding proteins that uses, as features for the prediction module, both the total and average scores of known and new SLiMs in protein sequences, computed with a new scoring method called sliding window scoring (SWS). A dataset of 194 manually curated human CaM-binding proteins and 193 mitochondrial proteins was obtained and used for testing the proposed model. The motif generation tool Multiple EM for Motif Elucidation (MEME) was used to obtain new motifs from each of the positive and negative datasets individually (the SM approach) and from the combined negative and positive datasets (the CM approach). Moreover, the wrapper criterion with random forest was applied for feature selection (FS), followed by classification using different algorithms such as k-nearest neighbors (k-NN), support vector machines (SVM), naive Bayes (NB) and random forest (RF).
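The abstract does not spell out how SWS computes its scores, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a motif is represented as a list of allowed-residue sets (one per position), that a window's score is the fraction of positions that match, and that the two features per motif are the sum and the mean of those window scores. The names `Motif`, `window_score`, `sliding_window_scores` and `feature_vector` are hypothetical.

```python
from typing import List, Set, Tuple

# Assumption: a motif is a list of allowed-residue sets, one set per position.
Motif = List[Set[str]]


def window_score(window: str, motif: Motif) -> float:
    """Fraction of motif positions whose allowed set contains the window residue."""
    matches = sum(1 for residue, allowed in zip(window, motif) if residue in allowed)
    return matches / len(motif)


def sliding_window_scores(sequence: str, motif: Motif) -> Tuple[float, float]:
    """Return (total, average) of the scores of every motif-length window."""
    width = len(motif)
    n_windows = len(sequence) - width + 1
    if n_windows <= 0:
        # Sequence shorter than the motif: no window can be scored.
        return 0.0, 0.0
    total = sum(
        window_score(sequence[i:i + width], motif) for i in range(n_windows)
    )
    return total, total / n_windows


def feature_vector(sequence: str, motifs: List[Motif]) -> List[float]:
    """Concatenate (total, average) for each motif into one feature vector."""
    features: List[float] = []
    for motif in motifs:
        total, average = sliding_window_scores(sequence, motif)
        features.extend([total, average])
    return features
```

Under these assumptions, calling `feature_vector(sequence, motifs)` on each protein yields two columns per motif, which could then be passed to a feature-selection step and to classifiers such as k-NN, SVM, NB or RF.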

CONCLUSIONS

Our proposed method shows very good prediction results and demonstrates that the information contained in SLiMs is highly relevant in predicting CaM-binding proteins. Further, three new CaM-binding motifs have been computationally selected and biologically validated in this study, and they can be used for predicting CaM-binding proteins.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6bd3/6245490/e060de7f6a89/12859_2018_2378_Fig1_HTML.jpg
