Xie MaoQiang, Xu YingJie, Zhang YaoGong, Hwang TaeHyun, Kuang Rui
College of Software, Nankai University, Tianjin, China.
Department of Clinical Science, University of Texas Southwestern Medical Center, Dallas, TX, USA.
PLoS One. 2015 May 1;10(5):e0125138. doi: 10.1371/journal.pone.0125138. eCollection 2015.
The availability of ontologies and systematic documentation of phenotypes and their genetic associations has enabled large-scale, network-based global analyses of the association between the complete collection of phenotypes (the phenome) and genes. To provide a fundamental understanding of how network information is relevant to phenotype-gene associations, we analyze the circular bigraphs (CBGs) in the OMIM (Online Mendelian Inheritance in Man) human disease phenotype-gene association network and the MGI (Mouse Genome Informatics) mouse phenotype-gene association network, and introduce a bi-random walk (BiRW) algorithm that captures the CBG patterns in these networks to unveil human and mouse phenome-genome associations. BiRW performs separate random walks simultaneously on the gene interaction network and the phenotype similarity network, exploring gene paths and phenotype paths in CBGs of different sizes and summarizing their associations as predictions.
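The idea of walking on both networks at once can be illustrated with a small sketch. The abstract does not state the update rule, so everything below is an assumption made for illustration: the symmetric degree normalization, the restart weight `alpha`, the step limits `left_steps` and `right_steps`, the averaging of the two walks, and all function names are hypothetical choices, not the published method. Plain Python lists are used so the example has no dependencies.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def normalize(w):
    """Symmetric degree normalization D^-1/2 W D^-1/2 (an assumed choice)."""
    deg = [sum(row) for row in w]
    n = len(w)
    return [
        [w[i][j] / (deg[i] * deg[j]) ** 0.5 if deg[i] > 0 and deg[j] > 0 else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def blend(walked, seed, alpha):
    """Mix one walk step with the known associations (restart term)."""
    return [[alpha * x + (1 - alpha) * s for x, s in zip(wr, sr)]
            for wr, sr in zip(walked, seed)]


def bi_random_walk(pheno_sim, gene_net, assoc, alpha=0.8,
                   left_steps=3, right_steps=3):
    """Score phenotype-gene pairs by walking on both networks (illustrative).

    pheno_sim: phenotype x phenotype similarity matrix
    gene_net:  gene x gene interaction matrix
    assoc:     phenotype x gene matrix of known associations (0/1)
    """
    total = sum(map(sum, assoc))
    if total == 0:
        raise ValueError("at least one known association is required")
    p = normalize(pheno_sim)
    g = normalize(gene_net)
    seed = [[v / total for v in row] for row in assoc]
    scores = seed
    for step in range(1, max(left_steps, right_steps) + 1):
        walks = []
        if step <= left_steps:   # extend paths on the phenotype network
            walks.append(blend(matmul(p, scores), seed, alpha))
        if step <= right_steps:  # extend paths on the gene network
            walks.append(blend(matmul(scores, g), seed, alpha))
        # Average whichever walks were taken at this step (assumed rule).
        scores = [[sum(vals) / len(walks) for vals in zip(*rows)]
                  for rows in zip(*walks)]
    return scores


# Toy data: two similar phenotypes, three genes on a path 0 - 1 - 2,
# and one known association (phenotype 0, gene 0).
toy_pheno = [[0, 1], [1, 0]]
toy_genes = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
toy_assoc = [[1, 0, 0], [0, 0, 0]]
toy_scores = bi_random_walk(toy_pheno, toy_genes, toy_assoc)
```

In this toy setting, a single step already gives non-zero scores to the neighbouring gene of the known association and to the similar phenotype. Larger step limits let longer gene and phenotype paths contribute.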
The analysis of both OMIM and MGI associations revealed that the majority of phenotype-gene associations are covered by CBG patterns of small path lengths, and that there is a clear correlation between CBG coverage and the predictability of the phenotype-gene associations. In cross-validation experiments on recovering known associations for human disease phenotypes and mouse phenotypes, BiRW effectively improved prediction performance over the compared methods. The constructed global human disease phenome-genome association map also revealed interesting new predictions and phenotype-gene modules by disease classes.