Nakariyakul Songyot, Liu Zhi-Ping, Chen Luonan
Key Laboratory of Systems Biology, SIBS-Novo Nordisk Translational Research Centre for PreDiabetes, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China; Department of Electrical and Computer Engineering, Thammasat University, Khlong Luang, Pathumthani 12120, Thailand.
Biochim Biophys Acta. 2014 Jan;1844(1 Pt B):165-70. doi: 10.1016/j.bbapap.2013.04.008. Epub 2013 Apr 20.
The PDZ domain is one of the most ubiquitous protein domains. It coordinates signaling complex formation and protein networking by reversibly interacting with multiple binding partners, and it has been linked to many devastating diseases such as avian influenza, Fraser syndrome, Usher syndrome and Dejerine-Sottas neuropathy. Understanding the selectivity of PDZ domains can help elucidate how defects in PDZ proteins and their binding partners lead to human diseases. Because experimental methods for determining the interaction specificity of PDZ domains are expensive and labor intensive, an accurate computational method is needed. We developed a support vector machine-based predictor that uses dipeptide composition and show that it qualitatively predicts PDZ domain-peptide interactions with high accuracy. Furthermore, since most of the dipeptide compositions are redundant or irrelevant, we propose a new hybrid feature selection technique that selects only a subset of these compositions for interaction prediction. The experimental results show that only approximately 25% of the dipeptide features are needed and that our method improves the prediction results significantly. The selected dipeptide features are also analyzed and shown to play important roles in the specificity patterns of PDZ domains. Our method is based only on primary sequence information, and it can be used to identify PDZ domain-ligand interactions in drug target research and drug design. This article is part of a Special Issue entitled: Computational Proteomics, Systems Biology & Clinical Implications. Guest Editor: Yudong Cai.
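The dipeptide composition used as the feature representation can be illustrated with a minimal sketch. It assumes the standard definition: the frequency of each of the 400 ordered pairs of the 20 standard amino acids among the adjacent residue pairs of a sequence. The function name `dipeptide_composition`, the handling of non-standard residues, and the example peptide string are assumptions made for illustration. The abstract does not specify how domain and peptide features are combined, nor the details of the hybrid feature selection, so neither is shown here.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered dipeptides, in a fixed order, and their vector positions.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide frequencies for a sequence.

    Adjacent pairs containing a non-standard residue are skipped
    (an assumption of this sketch). Frequencies are normalized by the
    number of valid pairs, so they sum to 1 when at least one exists.
    """
    seq = sequence.upper()
    counts = [0] * len(DIPEPTIDES)
    total = 0
    for i in range(len(seq) - 1):
        idx = INDEX.get(seq[i:i + 2])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [c / total for c in counts]


# Illustrative peptide string (hypothetical input, not taken from the paper).
features = dipeptide_composition("ETSV")
nonzero = {DIPEPTIDES[i]: v for i, v in enumerate(features) if v > 0}
print(len(features), nonzero)
```

A vector of this kind could then be passed to any support vector machine implementation, and a feature selection step would keep only a subset of the 400 positions.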