Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
J R Soc Interface. 2012 Jul 7;9(72):1625-36. doi: 10.1098/rsif.2011.0585. Epub 2012 Feb 1.
If one gene regulates another, those two genes are likely to be involved in many of the same biological functions. Conversely, shared biological function may be suggestive of the existence and nature of a regulatory interaction. With this in mind, we develop a measure of functional similarity between genes based on annotations made to the Gene Ontology in which the magnitude of their functional relationship is also indicative of a regulatory relationship. In contrast to other measures that have previously been used to quantify the functional similarity between genes, our measure scales the strength of any shared functional annotation by the frequency of that function's appearance across the entire set of annotations. We apply our method to both Escherichia coli and Saccharomyces cerevisiae gene annotations and find that the strength of our scaled similarity measure is more predictive of known regulatory interactions than previously published measures of functional similarity. In addition, we observe that the strength of the scaled similarity measure is correlated with the structural importance of links in the known regulatory network. By contrast, other measures of functional similarity are not indicative of any structural importance in the regulatory network. We therefore conclude that adequately adjusting for the frequency of shared biological functions is important in the construction of a functional similarity measure aimed at elucidating the existence and nature of regulatory interactions. We also compare the performance of the scaled similarity with a high-throughput method for determining regulatory interactions from gene expression data and observe that the ontology-based approach identifies a different subset of regulatory interactions compared with the gene expression approach. 
We show that combining predictions from the scaled similarity with those from the reconstruction algorithm leads to a significant improvement in the accuracy of the reconstructed network.
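The abstract does not give the formula for the scaled measure, so the following is only a minimal sketch of the general idea: weighting each shared annotation by how rarely that function appears across all annotations. The inverse-count weighting, the function names, and the toy gene and term labels are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

def term_frequencies(annotations):
    """Count how many genes are annotated to each term."""
    counts = Counter()
    for terms in annotations.values():
        counts.update(set(terms))
    return counts

def unscaled_similarity(gene_a, gene_b, annotations):
    """Baseline for comparison: the number of shared terms, all weighted equally."""
    return len(set(annotations[gene_a]) & set(annotations[gene_b]))

def scaled_similarity(gene_a, gene_b, annotations, counts):
    """Sum over shared terms, each down-weighted by its frequency.

    ASSUMPTION: the weight 1 / count is chosen for illustration only;
    the paper's actual scaling may differ.
    """
    shared = set(annotations[gene_a]) & set(annotations[gene_b])
    return sum(1.0 / counts[term] for term in shared)

# Hypothetical toy annotations (placeholder labels, not real GO identifiers).
annotations = {
    "geneA": {"term_general", "term_specific"},
    "geneB": {"term_general", "term_specific"},
    "geneC": {"term_general"},
    "geneD": {"term_general"},
}
counts = term_frequencies(annotations)

score_ab = scaled_similarity("geneA", "geneB", annotations, counts)
score_ac = scaled_similarity("geneA", "geneC", annotations, counts)
```

In this toy set, geneA and geneB share a term carried by only two genes, so their scaled score is higher than that of geneA and geneC, which share only a term carried by every gene. An unscaled count separates the pairs less sharply, because it gives the common term the same weight as the rare one.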