Shang Haixia, Liu Zhi-Ping
IEEE/ACM Trans Comput Biol Bioinform. 2021 Jan-Feb;18(1):336-346. doi: 10.1109/TCBB.2019.2917190. Epub 2021 Feb 3.
The prevalence of diabetes mellitus has been increasing rapidly in recent years. Type 2 diabetes accounts for about 90 percent of diabetes cases. The interacting effects of genetic and environmental factors offer a possible interpretation of its pathogenesis. Identifying the causal disease genes is therefore crucial for clinical diagnosis and medical treatment. Network-based computational methods have become a powerful tool for systematically analyzing complex diseases, for example by identifying candidate disease genes from networks. In this paper, we propose a bioinformatics framework for prioritizing type 2 diabetes genes by applying a modified PageRank algorithm to a bilayer biomolecular network consisting of an ensemble gene-gene regulatory network and an integrative protein-protein interaction network. We weight the networks by differential mutual information, which measures the context specificity between genes and between proteins using transcriptomic and proteomic datasets, respectively. After partitioning the network into two components, the known disease genes and the remaining normal genes, we rank the diabetes genes and the others across the bilayer network with an improved PageRank algorithm. The known disease genes achieve significantly higher ranks than randomly selected normal genes, and the ranks are robust and consistent across multiple validation scenarios. In functional analysis, the high-ranked genes are found to be associated with relevant risks and dysfunctions of type 2 diabetes.
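The ranking idea described in the abstract can be illustrated with a minimal sketch of personalized PageRank (a random walk that restarts at seed nodes) on a single weighted, undirected network. This is not the authors' modified algorithm: it does not model the bilayer structure, the edge weights here are arbitrary numbers standing in for differential mutual information, and the function name, input format, and node labels are all hypothetical.

```python
def personalized_pagerank(edges, seeds, damping=0.85, tol=1e-12, max_iter=1000):
    """Score nodes of a weighted undirected graph by a walk that restarts at seeds.

    edges: dict mapping (u, v) -> non-negative weight (hypothetical input format)
    seeds: iterable of nodes playing the role of known disease genes
    """
    # Build a symmetric weighted adjacency map.
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    nodes = sorted(adj)

    seed_set = {s for s in seeds if s in adj}
    if not seed_set:
        raise ValueError("no seed node is present in the network")

    # Restart distribution: uniform over the seed nodes, zero elsewhere.
    restart = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    strength = {n: sum(adj[n].values()) for n in nodes}

    score = dict(restart)
    for _ in range(max_iter):
        new = {n: (1.0 - damping) * restart[n] for n in nodes}
        for u in nodes:
            if strength[u] == 0:
                # A node with no outgoing weight returns its mass to the seeds.
                for n in nodes:
                    new[n] += damping * score[u] * restart[n]
                continue
            # Spread the score of u to its neighbours in proportion to edge weight.
            for v, w in adj[u].items():
                new[v] += damping * score[u] * w / strength[u]
        delta = sum(abs(new[n] - score[n]) for n in nodes)
        score = new
        if delta < tol:
            break
    return score


# Toy path network g1 - g2 - g3 - g4 with equal weights; g1 is the only seed.
toy_edges = {("g1", "g2"): 1.0, ("g2", "g3"): 1.0, ("g3", "g4"): 1.0}
toy_scores = personalized_pagerank(toy_edges, seeds={"g1"})
ranking = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

In this toy network, nodes closer to the seed receive higher scores than nodes farther away, which is the property a seed-based prioritization relies on. Extending the sketch to two coupled layers would require an explicit rule for how scores pass between the gene layer and the protein layer, which the abstract does not specify.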