Yang Jian, Yang Tinghong, Wu Duzhi, Lin Limei, Yang Fan, Zhao Jing
Department of Mathematics, Logistical Engineering University, Chongqing, China.
Institute of Interdisciplinary Complex Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China.
BMC Syst Biol. 2017 Jan 31;11(1):12. doi: 10.1186/s12918-017-0398-0.
Physical and functional interactions between genes or proteins are biologically important for cellular function. Several efforts have constructed weighted gene association meta-networks by integrating multiple biological resources, where each weight indicates the confidence of an interaction. However, the existing human gene association networks share only a limited number of interactions, which suggests that they are incomplete and noisy.
Here we propose a workflow to construct a weighted human gene association network from six existing networks: two weighted specific protein-protein interaction (PPI) networks and four gene association meta-networks. We applied a link prediction algorithm to predict possibly missing links in each network, used a cross-validation approach to refine each network, and finally integrated the refined networks into the final integrated network.
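The three steps of the workflow (predict missing links, refine each network against the others, integrate) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the resource-allocation index stands in for the unspecified link prediction algorithm, "cross-validation" is read as requiring support from at least one other network, and integration is a plain average of confidence weights. The function names `resource_allocation_scores`, `refine`, and `integrate` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def link(u, v):
    """Undirected link key; a network is a dict {link: confidence weight}."""
    return frozenset((u, v))


def build_adjacency(network):
    """Return {node: set of neighbours} for a network without self-loops."""
    adj = defaultdict(set)
    for pair in network:
        u, v = tuple(pair)
        adj[u].add(v)
        adj[v].add(u)
    return adj


def resource_allocation_scores(network):
    """Score unlinked pairs by the sum of 1/degree over common neighbours.

    Assumption: this index is a stand-in, not the paper's algorithm.
    """
    adj = build_adjacency(network)
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        pair = link(u, v)
        if pair in network:
            continue  # only score links that are currently missing
        common = adj[u] & adj[v]
        if common:
            scores[pair] = sum(1.0 / len(adj[z]) for z in common)
    return scores


def refine(network, others, top_k=2):
    """Keep existing links, and add top-k predicted links, only when at
    least one other network also contains them (assumed validation rule)."""
    predicted = resource_allocation_scores(network)
    ranked = sorted(predicted, key=lambda p: (-predicted[p], sorted(p)))[:top_k]
    refined = {}
    for pair, weight in network.items():
        if any(pair in other for other in others):
            refined[pair] = weight
    for pair in ranked:
        support = [other[pair] for other in others if pair in other]
        if support:
            # A predicted link borrows the mean weight of its supporters.
            refined[pair] = sum(support) / len(support)
    return refined


def integrate(networks):
    """Merge networks; a link's weight is the mean over networks holding it."""
    weights = defaultdict(list)
    for net in networks:
        for pair, weight in net.items():
            weights[pair].append(weight)
    return {pair: sum(ws) / len(ws) for pair, ws in weights.items()}


# Toy data (invented gene names and weights, for illustration only).
net_a = {link("A", "B"): 0.9, link("B", "C"): 0.8, link("A", "D"): 0.3}
net_b = {link("A", "B"): 0.7, link("A", "C"): 0.6, link("B", "C"): 0.5}

refined_a = refine(net_a, [net_b])
refined_b = refine(net_b, [net_a])
final_network = integrate([refined_a, refined_b])
```

In this toy run, the link A-D is dropped from `net_a` because no other network supports it, while the predicted link A-C is added because `net_b` contains it. With six real networks, the support threshold and the weighting scheme would need to be chosen with more care than this sketch suggests.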
The information shared among the refined networks increases notably, suggesting higher reliability. The final integrated network has many more links than most of the original networks, while its links retain high functional relevance. Used as the background network in a case study of disease gene prediction, the final integrated network performs well, which supports its reliability and practical value. Our workflow could offer insight for integrating and refining existing gene association data.