Wang Xiao-Dong, Huang Jia-Liang, Yang Lun, Wei Dong-Qing, Qi Ying-Xin, Jiang Zong-Lai
Institute of Mechanobiology and Medical Engineering, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai, China.
Bioinformatics, Integrated Platform Science, GlaxoSmithKline Research and Development China, Shanghai, China.
PLoS One. 2014 Jan 22;9(1):e86142. doi: 10.1371/journal.pone.0086142. eCollection 2014.
Identifying genes related to human diseases, such as cancer and cardiovascular disease, is an important task in biomedical research because of its applications in disease diagnosis and treatment. Interactome networks, especially protein-protein interaction networks, have been used to identify disease genes, based on the hypothesis that strong candidate genes tend to be closely related to one another under some measure on the network. We propose a new measure, called graphlet interaction, for analyzing the relationship between network nodes. Graphlet interaction comprises 28 different isomers. The results showed that the number of graphlet interaction isomers between disease genes in interactome networks was significantly larger than between randomly picked genes, whereas graphlet signatures showed no such difference. We then designed a new type of score, based on these network properties, to identify disease genes using graphlet interaction. Genes with higher scores were more likely to be disease genes, and all candidate genes were ranked according to their scores. The approach was evaluated by leave-one-out cross-validation. Its precision reached 90% at about 10% recall, clearly higher than that of three previously predominant algorithms: random walk, Endeavour, and the neighborhood-based method. Finally, the approach was applied to predict new disease genes related to 4 common diseases; most of the predicted genes had been identified by other independent experimental studies. In conclusion, we demonstrate that graphlet interaction is an effective tool for analyzing the network properties of disease genes, and that the scores calculated from graphlet interaction are more precise in identifying disease genes.
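The evaluation loop described in the abstract (score candidates, rank them, then check by leave-one-out cross-validation) can be illustrated with a minimal sketch. It rests on several assumptions: the abstract does not define the graphlet interaction score, so `placeholder_score` is a hypothetical stand-in that simply counts direct neighbors among the known disease genes; the toy network and gene names are invented for illustration; and the precision/recall bookkeeping is one plausible reading of the evaluation, not necessarily the authors' exact protocol.

```python
from typing import Dict, Iterable, List, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected adjacency map from an edge list."""
    graph: Graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def placeholder_score(graph: Graph, gene: str, seeds: Set[str]) -> int:
    """Hypothetical stand-in score: number of direct neighbors that are
    known disease genes. This is NOT the graphlet interaction score."""
    return len(graph.get(gene, set()) & seeds)


def rank_candidates(graph: Graph, candidates: Iterable[str],
                    seeds: Set[str]) -> List[Tuple[str, int]]:
    """Rank candidates by descending score, breaking ties by gene name."""
    scored = [(g, placeholder_score(graph, g, seeds)) for g in candidates]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


def loo_precision_recall(graph: Graph, disease_genes: Set[str],
                         threshold: int) -> Tuple[float, float]:
    """Leave-one-out cross-validation at a fixed score threshold.

    Each known disease gene is hidden in turn and scored against the
    remaining ones. A hidden gene scoring at or above the threshold is a
    true positive; any other non-seed gene doing so is a false positive.
    """
    tp = fp = 0
    for held_out in sorted(disease_genes):
        seeds = disease_genes - {held_out}
        for gene in graph:
            if gene in seeds:
                continue  # remaining known genes are not candidates
            if placeholder_score(graph, gene, seeds) >= threshold:
                if gene == held_out:
                    tp += 1
                else:
                    fp += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / len(disease_genes) if disease_genes else 0.0
    return precision, recall


# Toy network (invented): A, B, C form a triangle of "disease genes".
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"),
             ("C", "D"), ("D", "E"), ("E", "F"), ("F", "G")]
toy_graph = build_graph(toy_edges)
toy_disease = {"A", "B", "C"}

print(rank_candidates(toy_graph, ["D", "E", "F", "G"], toy_disease))
for t in (1, 2):
    print(t, loo_precision_recall(toy_graph, toy_disease, t))
```

Raising the threshold trades recall for precision, which is how an operating point such as "90% precision at about 10% recall" arises on a real network. Swapping in a different scoring function only requires replacing `placeholder_score`.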