Rzhetsky Andrey, Wajngurt David, Park Naeun, Zheng Tian
Department of Biomedical Informatics, Center for Computational Biology and Bioinformatics and Joint Centers for Systems Biology, Columbia University, New York, NY 10032, USA.
Proc Natl Acad Sci U S A. 2007 Jul 10;104(28):11694-9. doi: 10.1073/pnas.0704820104. Epub 2007 Jul 3.
Geneticists and epidemiologists often observe that certain hereditary disorders co-occur in individual patients significantly more (or significantly less) frequently than expected, suggesting that there is a genetic variation that predisposes its bearer to multiple disorders, or that protects against some disorders while predisposing to others. We suggest that, by using a large number of phenotypic observations about multiple disorders and an appropriate statistical model, we can infer genetic overlaps between phenotypes. Our proof-of-concept analysis of 1.5 million patient records and 161 disorders indicates that disease phenotypes form a highly connected network of strong pairwise correlations. Our modeling approach, under appropriate assumptions, allows us to estimate from these correlations the size of putative genetic overlaps. For example, we suggest that autism, bipolar disorder, and schizophrenia share significant genetic overlaps. Our disease network hypothesis can be immediately exploited in the design of genetic mapping approaches that involve joint linkage or association analyses of multiple seemingly disparate phenotypes.
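As a rough illustration of what a "pairwise correlation" between two disorders can mean for patient records, the sketch below computes the phi coefficient (the Pearson correlation of two binary indicators) and the co-occurrence count expected under independence. It is not the statistical model used in the paper, which the abstract does not specify; the function names, the record format (one set of disorder labels per patient), and the toy data are all assumptions made for illustration.

```python
from math import sqrt


def cooccurrence_counts(records, a, b):
    """Return (n, n_a, n_b, n_ab) for disorders a and b.

    `records` is a hypothetical format: one set of disorder labels per patient.
    """
    n = len(records)
    n_a = sum(1 for r in records if a in r)
    n_b = sum(1 for r in records if b in r)
    n_ab = sum(1 for r in records if a in r and b in r)
    return n, n_a, n_b, n_ab


def expected_cooccurrence(records, a, b):
    """Number of patients expected to have both disorders if they were independent."""
    n, n_a, n_b, _ = cooccurrence_counts(records, a, b)
    return n_a * n_b / n if n else 0.0


def phi_coefficient(records, a, b):
    """Pearson correlation between the binary indicators 'has a' and 'has b'."""
    n, n_a, n_b, n_ab = cooccurrence_counts(records, a, b)
    denom = sqrt(n_a * (n - n_a) * n_b * (n - n_b))
    if denom == 0:
        # One of the disorders is present in all or none of the records.
        return 0.0
    return (n * n_ab - n_a * n_b) / denom


# Toy data, invented for illustration only.
toy_records = [{"A", "B"}, {"A", "B"}, {"C"}, set()]
phi_ab = phi_coefficient(toy_records, "A", "B")  # A and B always appear together
phi_ac = phi_coefficient(toy_records, "A", "C")  # A and C never appear together
```

A positive value means the two disorders appear together more often than independence would predict, and a negative value means less often. Turning such correlations into estimates of genetic overlap requires the additional modeling assumptions that the abstract refers to.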