Department of Cell and Systems Biology, University of Toronto, Toronto, Canada M5S 3G5.
Bioinformatics. 2012 Apr 1;28(7):962-9. doi: 10.1093/bioinformatics/bts060. Epub 2012 Feb 1.
Protein kinases represent critical links in cell signaling. A central problem in computational biology is to systematically identify their substrates.
This study introduces a new method to predict kinase substrates by extracting evolutionary information from multiple sequence alignments in a manner that tolerates degenerate motif positioning. Given a known consensus motif, the new method (ConDens) compares the observed density of matches to the density expected under a null model of evolution, and it does not require labeled training data. We confirmed that ConDens performs better than several existing methods in the field. Further, we show that it is generalizable and can predict interesting substrates for several important eukaryotic kinases for which training data are not available.
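The idea of comparing an observed match density against a null expectation can be illustrated with a minimal sketch. This is not the ConDens implementation: the abstract does not specify the null model of evolution, so the sketch substitutes a simple residue-shuffling null, and the function names (`match_density`, `shuffle_null_pvalue`), the regular-expression form of the consensus, and the example motif `[ST]P` are all assumptions made for illustration.

```python
import random
import re


def match_density(seqs, pattern):
    """Consensus matches per residue, pooled over all sequences.

    Alignment gaps ('-') are removed first, so a match is counted
    wherever it occurs in each sequence, not at a fixed column.
    """
    # Zero-width lookahead so that overlapping matches are all counted.
    regex = re.compile(f"(?=({pattern}))")
    total_matches = 0
    total_len = 0
    for s in seqs:
        s = s.replace("-", "")
        total_matches += sum(1 for _ in regex.finditer(s))
        total_len += len(s)
    return total_matches / total_len if total_len else 0.0


def shuffle_null_pvalue(seqs, pattern, n_perm=1000, seed=0):
    """Empirical p-value for the observed density.

    ASSUMPTION: the null here shuffles residues within each sequence,
    a stand-in for (not a reproduction of) a null model of evolution.
    """
    rng = random.Random(seed)
    observed = match_density(seqs, pattern)
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for s in seqs:
            chars = list(s.replace("-", ""))
            rng.shuffle(chars)
            shuffled.append("".join(chars))
        if match_density(shuffled, pattern) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `shuffle_null_pvalue(aligned_orthologs, "[ST]P")` on a list of aligned ortholog sequences returns the pooled density and an empirical p-value. A shuffling null preserves only amino acid composition, so it ignores phylogenetic relatedness between the sequences, which an evolutionary null model would account for.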
ConDens can be found at http://www.moseslab.csb.utoronto.ca/andyl/.
Supplementary data are available at Bioinformatics online.