Cluster based prediction of PDZ-peptide interactions.

Publication information

BMC Genomics. 2014;15 Suppl 1(Suppl 1):S5. doi: 10.1186/1471-2164-15-S1-S5. Epub 2014 Jan 24.

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4046824/

Abstract

BACKGROUND

PDZ domains are among the most promiscuous protein recognition modules; they bind short linear peptides and play an important role in cellular signaling. Recently, a few high-throughput techniques (e.g. protein microarray screens, phage display) have been applied to determine the in vitro binding specificity of PDZ domains. Many computational methods are currently available to predict PDZ-peptide interactions, but they often provide domain-specific models and/or have limited domain coverage.

RESULTS

Here, we compiled the largest set of PDZ domains derived from the human, mouse, fly and worm proteomes and defined binding models for PDZ domain families to improve domain coverage and prediction specificity. For that purpose, we first identified a novel set of 138 PDZ families, comprising 548 PDZ domains from the aforementioned organisms, by clustering the domains according to their sequence identity. For 43 PDZ families, covering 226 PDZ domains with available interaction data, we built specialized models using a support vector machine approach. The advantage of family-wise models is that they can also be used to determine the binding specificity of a newly characterized PDZ domain that has sufficient sequence identity to a known family. Since most current experimental approaches provide only positive data, we have to cope with the class imbalance problem. To enrich the negative class, we therefore introduced a semi-supervised technique to generate high-confidence non-interaction data. We report competitive predictive performance with respect to state-of-the-art approaches.
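The family-building step described above can be illustrated with a minimal sketch. It assumes pre-aligned, equal-length domain sequences, a single-linkage rule, and an arbitrary identity threshold; the abstract does not state the clustering algorithm or cutoff the authors used, so the function names, the toy sequences, and the value 0.8 are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set


def sequence_identity(a: str, b: str) -> float:
    """Fraction of aligned positions with the same residue (gaps never match)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def cluster_by_identity(seqs: Dict[str, str], threshold: float) -> List[Set[str]]:
    """Single-linkage clustering: two domains share a family if a chain of
    pairwise identities >= threshold connects them (assumed rule)."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if sequence_identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    families: Dict[str, Set[str]] = {}
    for name in seqs:
        families.setdefault(find(name), set()).add(name)
    # Sort families by their member names for a deterministic result.
    return sorted(families.values(), key=lambda members: sorted(members))


# Toy, made-up sequences (not real PDZ domains).
toy_domains = {"d1": "GLGFSI", "d2": "GLGFNI", "d3": "KRAWTE"}
toy_families = cluster_by_identity(toy_domains, threshold=0.8)
```

With these toy inputs, `d1` and `d2` differ at one of six positions and fall into one family, while `d3` forms its own. A new domain could be assigned to an existing family in the same way, by checking its identity against that family's members.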

CONCLUSIONS

Our approach makes several contributions. First, we show that domain coverage can be increased by applying an accurate clustering technique. Second, we developed an approach based on a semi-supervised strategy to obtain high-confidence negative data. Third, we allowed high-order correlations between the amino acid positions in the binding peptides. Fourth, our method is general enough to be readily applicable to other peptide recognition modules such as SH2 domains. Finally, we performed a genome-wide prediction for 101 human and 102 mouse PDZ domains and uncovered novel interactions with biological relevance. We make all the predictive models and genome-wide predictions freely available to the scientific community.
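One common way to let a support vector machine capture correlations between peptide positions is a polynomial kernel over one-hot encoded residues. The sketch below illustrates that general idea only; the abstract does not say which kernel or encoding the authors used, and `one_hot`, `poly_kernel`, the degree, and the example peptides are all assumptions made for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(peptide):
    """Encode a peptide as a flat 0/1 vector, 20 entries per position."""
    vec = []
    for residue in peptide:
        vec.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vec


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree.

    For one-hot vectors, x . y is the number of positions where the two
    peptides carry the same residue. Expanding the power yields products
    of up to `degree` position-specific features, which is how a kernel
    of this form accounts for correlations between positions.
    """
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


# Hypothetical C-terminal tetrapeptides.
k_same = poly_kernel(one_hot("ETSV"), one_hot("ETSV"))   # 4 shared positions
k_close = poly_kernel(one_hot("ETSV"), one_hot("ETDV"))  # 3 shared positions
k_far = poly_kernel(one_hot("ETSV"), one_hot("KRWL"))    # 0 shared positions
```

A linear kernel (`degree=1`) would score each position independently; raising the degree is what introduces the cross-position terms.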
