Ghent University - imec, IDLab, Technologiepark 15, Ghent, 9052, Belgium.
Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium.
BMC Bioinformatics. 2019 Jan 15;20(1):27. doi: 10.1186/s12859-018-2483-9.
Graphlets are useful for bioinformatics network analysis. Based on the structure of Hočevar and Demšar's ORCA algorithm, we have created an orbit counting algorithm, named Jesse. This algorithm, like ORCA, uses equations to count the orbits, but unlike ORCA it can count graphlets of any order. To do so, it generates the required internal structures and equations automatically. Many more redundant equations are generated, however, and Jesse's running time is highly dependent on which of these equations are used. Therefore, this paper aims to investigate which equations are most efficient, and which factors have an effect on this efficiency.
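To make the idea of counting orbits with equations concrete, here is a minimal sketch for graphlets of order 3 only. It is not Jesse's or ORCA's actual code or equation set; the function name `order3_orbits` and the adjacency-dict input format are assumptions made for illustration. The sketch enumerates triangles directly and then obtains the path-centre orbit from the identity C(deg(v), 2) = o2(v) + o3(v), which holds because every pair of neighbours of v is either connected (a triangle) or not (a path centred on v).

```python
from itertools import combinations
from math import comb


def order3_orbits(adj):
    """Return {node: (o1, o2, o3)} for an undirected simple graph.

    adj is a dict mapping each node to the set of its neighbours
    (hypothetical input format chosen for this sketch).
    o1: node is an end of a 3-node path
    o2: node is the centre of a 3-node path
    o3: node is part of a triangle
    """
    counts = {}
    for v, nbrs in adj.items():
        # o3 is enumerated directly: pairs of neighbours that are adjacent.
        o3 = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        # o2 follows from the equation C(deg, 2) = o2 + o3, no enumeration needed.
        o2 = comb(len(nbrs), 2) - o3
        # o1: walk v - u - w where w is neither v nor a neighbour of v.
        o1 = sum(1 for u in nbrs for w in adj[u] if w != v and w not in nbrs)
        counts[v] = (o1, o2, o3)
    return counts


path = {0: {1}, 1: {0, 2}, 2: {1}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path_counts = order3_orbits(path)
triangle_counts = order3_orbits(triangle)
```

Only the densest graphlet (the triangle) is enumerated here; the sparser orbit is derived arithmetically, which is the general pattern the abstract refers to when it says the orbits are counted using equations.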
With appropriate equation selection, Jesse's running time may be reduced by a factor of up to 2 in the best case, compared to using randomly selected equations. Which equations are most efficient depends on the density of the graph, but barely on the graph type. At low graph density, equations whose right-hand-side terms have few arguments are most efficient, whereas at high density, equations whose right-hand-side terms have many arguments are most efficient. At a density between 0.6 and 0.7, both types of equations are about equally efficient.
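The density-based rule described above can be sketched as follows. This is an illustration under stated assumptions, not Jesse's implementation: the function names, the string labels, and the single crossover value of 0.65 (picked from inside the reported 0.6 to 0.7 range) are all hypothetical.

```python
def graph_density(num_nodes, num_edges):
    """Density of an undirected simple graph: edges present / edges possible."""
    if num_nodes < 2:
        return 0.0
    return 2 * num_edges / (num_nodes * (num_nodes - 1))


def preferred_equation_type(density, crossover=0.65):
    """Pick which kind of equation to favour for a given density.

    crossover is an assumed cut-off inside the 0.6-0.7 band in which
    both kinds of equation are reported to perform about equally.
    """
    if density < crossover:
        return "few_arguments"   # right-hand-side terms with few arguments
    return "many_arguments"      # right-hand-side terms with many arguments


sparse_choice = preferred_equation_type(graph_density(100, 300))
dense_choice = preferred_equation_type(graph_density(10, 40))
```

Because the two kinds of equation perform similarly near the crossover, the exact cut-off chosen within that band should matter little in practice.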
By automatically selecting the best equations based on graph density, our Jesse algorithm became up to a factor of 2 more efficient. To ease its use by bioinformaticians, it was adapted into a Cytoscape App that is freely available from the Cytoscape App Store.