Embar Varsha, Handen Adam, Ganapathiraju Madhavi K
* Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
† Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15213, USA.
J Bioinform Comput Biol. 2016 Dec;14(6):1660002. doi: 10.1142/S0219720016600027.
When a set of genes is identified as related to a disease, say through gene expression analysis, it is common to examine the average distance among their protein products in the human interactome as a measure of the biological relatedness of these genes. The reasoning is that genes associated with a disease tend to be functionally related, and that functionally related genes are closely connected to each other in the interactome. Typically, the average shortest path length (ASPL) of disease genes (although referred to as genes in the context of disease associations, the interactions are among the protein products of these genes) is compared to the ASPL of randomly selected genes or to the ASPL in a randomly permuted network. We examined whether the ASPL of a set of genes is indeed a good measure of biological relatedness or whether it is simply a characteristic of the degree distribution of those genes. We examined the ASPL of gene sets from some disease and pathway associations and compared them to the ASPL of three types of randomly selected control sets: uniform selection from the entire proteome, degree-matched selection, and random permutation of the network. We found that disease-associated genes and their degree-matched random genes have comparable ASPL. In other words, ASPL is a characteristic of the degree of the genes and the network topology, and not of functional coherence.
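The comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy network, the gene names, and the helper functions (`build_adjacency`, `aspl`, `uniform_control`, `degree_matched_control`) are all hypothetical, and it assumes an unweighted, undirected interactome in which unreachable pairs are simply skipped. Only the uniform and degree-matched controls are sketched; the network-permutation control is omitted.

```python
import random
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Hop distance from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def aspl(adj, genes):
    """Average shortest path length over all connected pairs within `genes`."""
    genes = list(genes)
    total, pairs = 0, 0
    for i, g in enumerate(genes):
        dist = bfs_distances(adj, g)
        for h in genes[i + 1:]:
            if h in dist:  # assumption: disconnected pairs are ignored
                total += dist[h]
                pairs += 1
    return total / pairs if pairs else float("nan")


def uniform_control(adj, genes, rng):
    """Control set drawn uniformly from all nodes, same size as `genes`."""
    return rng.sample(sorted(adj), len(genes))


def degree_matched_control(adj, genes, rng):
    """Control set in which each gene is replaced by a node of equal degree."""
    by_degree = {}
    for node in sorted(adj):
        by_degree.setdefault(len(adj[node]), []).append(node)
    chosen = []
    for g in genes:
        # Sample without replacement; the original genes stay eligible (assumption).
        pool = [n for n in by_degree[len(adj[g])] if n not in chosen]
        chosen.append(rng.choice(pool))
    return chosen


# Hypothetical toy network: two hubs (H1, H2) joined by a bridge, plus leaves.
toy_edges = [("H1", "H2"), ("H1", "a"), ("H1", "b"), ("H1", "c"),
             ("H2", "d"), ("H2", "e"), ("H2", "f"), ("f", "g")]
network = build_adjacency(toy_edges)
disease_genes = ["H1", "H2", "f"]  # hypothetical "disease" set

rng = random.Random(0)
print("disease ASPL:       ", aspl(network, disease_genes))
print("uniform control:    ", aspl(network, uniform_control(network, disease_genes, rng)))
print("degree-matched ctrl:", aspl(network, degree_matched_control(network, disease_genes, rng)))
```

In practice each control would be resampled many times and the disease-set ASPL compared against the resulting distribution; a single draw, as here, only shows the mechanics.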