Suppr超能文献

用于正则树文法网络搜索的算法及其在挖掘人类-病毒感染模式中的应用

Algorithms for Regular Tree Grammar Network Search and Their Application to Mining Human-viral Infection Patterns.

作者信息

Smoly Ilan, Carmel Amir, Shemer-Avni Yonat, Yeger-Lotem Esti, Ziv-Ukelson Michal

机构信息

1 Department of Computer Science, Ben-Gurion University of the Negev , Beer-Sheva, Israel .

2 Department of Virology, Ben-Gurion University of the Negev , Beer-Sheva, Israel .

出版信息

J Comput Biol. 2016 Mar;23(3):165-79. doi: 10.1089/cmb.2015.0168.

Abstract

Network querying is a powerful approach to mine molecular interaction networks. Most state-of-the-art network querying tools either confine the search to a prespecified topology in the form of some template subnetwork, or do not specify any topological constraints at all. Another approach is grammar-based queries, which are more flexible and expressive as they allow for expressing the topology of the sought pattern according to some grammar-based logic. Previous grammar-based network querying tools were confined to the identification of paths. In this article, we extend the patterns identified by grammar-based query approaches from paths to trees. For this, we adopt a higher order query descriptor in the form of a regular tree grammar (RTG). We introduce a novel problem and propose an algorithm to search a given graph for the k highest scoring subgraphs matching a tree accepted by an RTG. Our algorithm is based on the combination of dynamic programming with color coding, and includes an extension of previous k-best parsing optimization approaches to avoid isomorphic trees in the output. We implement the new algorithm and exemplify its application to mining viral infection patterns within molecular interaction networks. Our code is available online.

摘要

网络查询是挖掘分子相互作用网络的一种强大方法。大多数最先进的网络查询工具要么将搜索限制在某种模板子网形式的预先指定拓扑中,要么根本不指定任何拓扑约束。另一种方法是基于语法的查询,它更加灵活且富有表现力,因为它们允许根据某种基于语法的逻辑来表达所寻找模式的拓扑结构。以前基于语法的网络查询工具仅限于路径的识别。在本文中,我们将基于语法的查询方法所识别的模式从路径扩展到树。为此,我们采用正则树文法(RTG)形式的高阶查询描述符。我们引入了一个新问题,并提出了一种算法,用于在给定图中搜索与RTG接受的树匹配的k个得分最高的子图。我们的算法基于动态规划与颜色编码的结合,并且包括对先前k最佳解析优化方法的扩展,以避免输出中的同构树。我们实现了新算法,并举例说明了其在挖掘分子相互作用网络中的病毒感染模式方面的应用。我们的代码可在线获取。

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验