Köhler Sebastian, Bauer Sebastian, Horn Denise, Robinson Peter N
Institute for Medical Genetics, Charité Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany.
Am J Hum Genet. 2008 Apr;82(4):949-58. doi: 10.1016/j.ajhg.2008.02.013. Epub 2008 Mar 27.
The identification of genes associated with hereditary disorders has contributed to improving medical care and to a better understanding of gene functions, interactions, and pathways. However, there are well over 1500 Mendelian disorders whose molecular basis remains unknown. At present, methods such as linkage analysis can identify the chromosomal region in which unknown disease genes are located, but the regions could contain up to hundreds of candidate genes. In this work, we present a method for prioritization of candidate genes by use of a global network distance measure, random walk analysis, for definition of similarities in protein-protein interaction networks. We tested our method on 110 disease-gene families with a total of 783 genes and achieved an area under the ROC curve of up to 98% on simulated linkage intervals of 100 genes surrounding the disease gene, significantly outperforming previous methods based on local distance measures. Our results not only provide an improved tool for positional-cloning projects but also add weight to the assumption that phenotypically similar diseases are associated with disturbances of subnetworks within the larger protein interactome that extend beyond the disease proteins themselves.
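The abstract names random walk analysis as its global distance measure but does not give the formulation. As a minimal sketch, the code below uses one common variant, a random walk with restart, in which the walker returns to the known disease genes with a fixed probability at every step, and candidates are ranked by their steady-state visiting probability. The function name, the restart probability of 0.7, the toy network, and the gene labels A to F are all illustrative assumptions, not values taken from the paper.

```python
def random_walk_with_restart(neighbors, seeds, restart=0.7, tol=1e-12, max_iter=10_000):
    """Steady-state visiting probabilities of a random walk with restart.

    neighbors: dict mapping each node to the set of its neighbours (undirected graph).
    seeds:     nodes the walker restarts from (here: known disease genes).
    restart:   probability of jumping back to a seed at each step (assumed value).
    """
    nodes = list(neighbors)
    # Initial distribution: equal probability mass on every seed node.
    p0 = {n: 0.0 for n in nodes}
    for s in seeds:
        p0[s] = 1.0 / len(seeds)

    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: a fraction `restart` of the mass returns to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: each node spreads its remaining mass evenly over its neighbours.
        for n in nodes:
            degree = len(neighbors[n])
            if degree == 0:
                continue  # an isolated node has nowhere to pass its mass
            share = (1.0 - restart) * p[n] / degree
            for m in neighbors[n]:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:  # L1 change below tolerance: treat as converged
            break
    return p


# Hypothetical toy interaction network; gene "A" plays the known disease gene.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
neighbors = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)

scores = random_walk_with_restart(neighbors, seeds=["A"])

# Rank the genes of a made-up linkage interval by proximity to the seed.
candidates = ["F", "C", "E"]
ranked = sorted(candidates, key=lambda g: scores[g], reverse=True)
print(ranked)
```

In this toy network, C interacts directly with the seed and therefore ranks ahead of E and F, which lie three and four steps away. Because the score accumulates over all paths rather than only the shortest one, it behaves as a global measure in the sense the abstract contrasts with local distance measures.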