Gawronski Alex R, Turcotte Marcel
BMC Bioinformatics. 2014;15 Suppl 13(Suppl 13):S2. doi: 10.1186/1471-2105-15-S13-S2. Epub 2014 Nov 13.
Frequent subgraph mining is a useful method for extracting meaningful patterns from a set of graphs or from a single large graph. Here, the graph represents all possible RNA structures and interactions. Patterns that are significantly more frequent in this graph than in a random graph are extracted. We hypothesize that these patterns are the most likely to represent biological mechanisms. The graph representation used is a directed dual graph, extended to handle intermolecular interactions. The graph is sampled for subgraphs, which are labeled using a canonical labeling method and counted. The resulting patterns are compared to those obtained from a randomized dataset and scored. The algorithm was applied to the mitochondrial genome of the kinetoplastid species Trypanosoma brucei, which has a unique RNA editing mechanism. The most significant patterns contain two stem-loops, indicative of guide RNA (gRNA), and represent interactions of these structures with target mRNA.
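The pipeline described in the abstract (sample subgraphs, assign canonical labels, count, compare against a randomized graph, score) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the input is a plain directed edge list rather than a directed dual graph of RNA structures, the canonical label is found by brute force over node permutations (practical only for very small subgraphs), the null model simply redraws random edges, and the score is a pseudocount ratio. The function names `canonical_label`, `sample_subgraph`, `count_patterns`, `randomize`, and `score_patterns` are hypothetical.

```python
import itertools
import random
from collections import Counter


def canonical_label(nodes, edges):
    """Brute-force canonical form: the smallest sorted edge tuple over all
    relabelings of the nodes to 0..k-1 (assumption: k is small)."""
    nodes = list(nodes)
    best = None
    for perm in itertools.permutations(range(len(nodes))):
        relabel = dict(zip(nodes, perm))
        form = tuple(sorted((relabel[u], relabel[v]) for u, v in edges))
        if best is None or form < best:
            best = form
    return best


def sample_subgraph(neighbors, k, rng):
    """Grow a weakly connected set of k nodes from a random seed node.
    Returns None if the seed's component has fewer than k nodes."""
    nodes = {rng.choice(sorted(neighbors))}
    while len(nodes) < k:
        frontier = sorted({n for u in nodes for n in neighbors[u]} - nodes)
        if not frontier:
            return None
        nodes.add(rng.choice(frontier))
    return nodes


def count_patterns(edges, k, n_samples, rng):
    """Sample induced k-node subgraphs and count them by canonical label."""
    neighbors = {}
    for u, v in edges:  # ignore direction when testing connectivity
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    counts = Counter()
    for _ in range(n_samples):
        nodes = sample_subgraph(neighbors, k, rng)
        if nodes is None:
            continue
        induced = [(u, v) for u, v in edges if u in nodes and v in nodes]
        counts[canonical_label(nodes, induced)] += 1
    return counts


def randomize(edges, rng):
    """Hypothetical null model: same node set and edge count, with edges
    redrawn uniformly at random. Assumes the input has no self-loops."""
    nodes = sorted({n for e in edges for n in e})
    target = len(set(edges))
    out = set()
    while len(out) < target:
        u, v = rng.sample(nodes, 2)
        out.add((u, v))
    return sorted(out)


def score_patterns(edges, k=3, n_samples=2000, seed=0):
    """Score each observed pattern by a pseudocount ratio of its count in
    the real graph to its count in the randomized graph (an assumption)."""
    rng = random.Random(seed)
    real = count_patterns(edges, k, n_samples, rng)
    null = count_patterns(randomize(edges, rng), k, n_samples, rng)
    return {p: (c + 1) / (null.get(p, 0) + 1) for p, c in real.items()}
```

Calling `score_patterns(edge_list, k=3)` on a list of `(source, target)` pairs returns a dictionary that maps each canonical pattern to its enrichment ratio; values well above 1 mark patterns that are overrepresented relative to the random graph. A real application would replace the brute-force labeling with a proper canonical labeling algorithm and the ratio with a statistical significance test.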