Fober Thomas, Mernberger Marco, Klebe Gerhard, Hüllermeier Eyke
Department of Mathematics and Computer Science, Philipps-Universität Marburg, 35032 Marburg, Germany.
The first two authors should be regarded as joint First Authors.
Mol Inform. 2012 Jul;31(6-7):443-52. doi: 10.1002/minf.201100149. Epub 2012 Jul 10.
A key task in structural biology is to define a meaningful similarity measure for the comparison of protein structures. Recently, the use of graphs as modeling tools for molecular data has gained increasing importance. In this context, kernel functions have attracted a lot of attention, especially since they allow for the application of a rich repertoire of methods from the field of kernel-based machine learning. However, most existing graph kernels have been designed for unlabeled and/or unweighted graphs, although proteins are often represented more naturally and more exactly as node-labeled and edge-weighted graphs. Here we analyze kernel-based protein comparison methods and propose extensions to existing graph kernels that exploit node labels and edge weights. Moreover, we propose an instance of the substructure fingerprint kernel suitable for the analysis of protein binding sites. By using fuzzy fingerprints, we solve the problem of discontinuity at bin boundaries that arises in the case of labeled graphs.
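The bin-boundary discontinuity mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the triangular membership function, and the bin layout are assumptions chosen only for illustration. Two nearly identical edge weights (for example, distances of 3.99 and 4.01) fall into different crisp bins and look completely dissimilar, whereas a fuzzy fingerprint shares each value between neighboring bins.

```python
# Illustrative sketch only; names and the triangular membership are assumptions,
# not taken from the paper.

def crisp_histogram(values, edges):
    """Hard binning: each value adds 1 to exactly one bin [edges[i], edges[i+1])."""
    hist = [0.0] * (len(edges) - 1)
    for v in values:
        for i in range(len(hist)):
            if edges[i] <= v < edges[i + 1]:
                hist[i] += 1.0
                break
    return hist


def fuzzy_histogram(values, centers):
    """Soft binning: each value is split between the two nearest bin centers
    with triangular memberships that sum to 1."""
    hist = [0.0] * len(centers)
    for v in values:
        if v <= centers[0]:
            hist[0] += 1.0
        elif v >= centers[-1]:
            hist[-1] += 1.0
        else:
            for i in range(len(centers) - 1):
                lo, hi = centers[i], centers[i + 1]
                if lo <= v <= hi:
                    w = (v - lo) / (hi - lo)  # share given to the upper bin
                    hist[i] += 1.0 - w
                    hist[i + 1] += w
                    break
    return hist


def dot(a, b):
    """Linear kernel on two fingerprint vectors."""
    return sum(x * y for x, y in zip(a, b))


# Two almost equal distances on either side of the boundary at 4.0
crisp_sim = dot(crisp_histogram([3.99], [0, 4, 8]),
                crisp_histogram([4.01], [0, 4, 8]))
fuzzy_sim = dot(fuzzy_histogram([3.99], [2, 6]),
                fuzzy_histogram([4.01], [2, 6]))
print(crisp_sim, fuzzy_sim)
```

With crisp bins the similarity of the two values drops to zero, while the fuzzy variant changes smoothly as a value crosses the boundary.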