Hoi Steven C H, Li Zhenhua, Wong Limsoon, Nguyen Hung, Li Jinyan
IEEE/ACM Trans Comput Biol Bioinform. 2014 Jan-Feb;11(1):7-16. doi: 10.1109/TCBB.2013.136.
Coupling graphs are newly introduced in this paper to meet many application needs particularly in the field of bioinformatics. A coupling graph is a two-layer graph complex, in which each node from one layer of the graph complex has at least one connection with the nodes in the other layer, and vice versa. The coupling graph model is sufficiently powerful to capture strong and inherent associations between subgraph pairs in complicated applications. The focus of this paper is on mining algorithms of frequent coupling subgraphs and bioinformatics application. Although existing frequent subgraph mining algorithms are competent to identify frequent subgraphs from a graph database, they perform poorly on frequent coupling subgraph mining because they generate many irrelevant subgraphs. We propose a novel graph transformation technique to transform a coupling graph into a generic graph. Based on the transformed coupling graphs, existing graph mining methods are then utilized to discover frequent coupling subgraphs. We prove that the transformation is precise and complete and that the restoration is reversible. Experiments carried out on a database containing 10,511 coupling graphs show that our proposed algorithm reduces the mining time very much in comparison with the existing subgraph mining algorithms. Moreover, we demonstrate the usefulness of frequent coupling subgraphs by applying our algorithm to make accurate predictions of epitopes in antibody-antigen binding.
本文新引入了耦合图,以满足多种应用需求,特别是在生物信息学领域。耦合图是一种两层图复合体,其中图复合体一层的每个节点与另一层的节点至少有一个连接,反之亦然。耦合图模型足够强大,能够捕捉复杂应用中子图对之间强烈的内在关联。本文重点关注频繁耦合子图的挖掘算法及其在生物信息学中的应用。尽管现有的频繁子图挖掘算法能够从图数据库中识别频繁子图,但它们在频繁耦合子图挖掘方面表现不佳,因为会生成许多不相关的子图。我们提出了一种新颖的图变换技术,将耦合图转换为通用图。基于变换后的耦合图,利用现有的图挖掘方法来发现频繁耦合子图。我们证明了这种变换是精确且完备的,并且恢复是可逆的。在包含10511个耦合图的数据库上进行的实验表明,与现有的子图挖掘算法相比,我们提出的算法大大减少了挖掘时间。此外,我们通过应用我们的算法对抗体 - 抗原结合中的表位进行准确预测,证明了频繁耦合子图的实用性。