Wu Jingli, Zhang Qi, Li Gaoshi
Guangxi Key Lab of Multi-Source Information Mining & Security, Guangxi Normal University, Guilin 541004, P. R. China.
Yimeng Executive Leadership Academy, Linyi 276000, P. R. China.
J Bioinform Comput Biol. 2022 Feb;20(1):2150031. doi: 10.1142/S0219720021500311. Epub 2021 Dec 3.
With the rapid development of deep sequencing technologies, a large amount of high-throughput data has become available for studying carcinogenic mechanisms at the molecular level. It is widely accepted that the development and progression of cancer are regulated by modules or pathways rather than by individual genes, and the identification of cancer-related active modules has received extensive attention. In this paper, we propose ModFinder, an identification method that integrates biological networks with gene expression profiles. Specifically, a gene scoring function is devised using a regression model with a [Formula: see text]-step random walk kernel, and genes are ranked by both their active scores and their degrees in the protein-protein interaction (PPI) network. A greedy algorithm, NSEA, is then introduced to find an active module with a high score and strong connectivity. Experiments were performed on both simulated data and real biological data, namely breast cancer and cervical cancer datasets. Compared with the previous methods SigMod, LEAN and RegMod, ModFinder shows competitive performance. It identifies a well-connected module that contains a large proportion of cancer-related genes, including some well-known oncogenes or tumor suppressors enriched in cancer-related pathways.
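The pipeline described above (kernel-based gene scoring followed by greedy module growth) can be illustrated with a minimal sketch. This is not the authors' ModFinder or NSEA implementation: the abstract does not give the scoring formula or the stopping rule, so everything below is an assumption made for illustration. The kernel uses the standard p-step random walk form K = (aI - L)^p with L the normalized graph Laplacian; the scoring step is plain kernel ridge smoothing; and the hypothetical `greedy_module` simply adds the best-scoring neighbor until no neighbor has a positive score. The function names, the parameters `steps`, `a`, `lam`, `max_size`, and the toy data are all invented here.

```python
import numpy as np

def random_walk_kernel(adj, steps=2, a=2.0):
    """p-step random walk kernel K = (a*I - L)^steps, L = normalized Laplacian."""
    adj = np.asarray(adj, dtype=float)
    n = len(adj)
    deg = adj.sum(axis=1)
    # Isolated nodes get a zero scaling factor instead of a division by zero.
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    lap = np.eye(n) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    return np.linalg.matrix_power(a * np.eye(n) - lap, steps)

def smooth_scores(kernel, z, lam=1.0):
    """Assumed scoring step: kernel ridge regression fit of raw scores z."""
    z = np.asarray(z, dtype=float)
    return kernel @ np.linalg.solve(kernel + lam * np.eye(len(z)), z)

def greedy_module(adj, scores, seed, max_size=10):
    """Hypothetical greedy growth: add the best-scoring neighbor of the module."""
    adj = np.asarray(adj) > 0
    scores = np.asarray(scores, dtype=float)
    module = [seed]
    while len(module) < max_size:
        # Nodes adjacent to at least one module member, excluding members.
        frontier = [g for g in np.flatnonzero(adj[module].any(axis=0))
                    if g not in module]
        if not frontier:
            break
        best = max(frontier, key=lambda g: scores[g])
        if scores[best] <= 0:  # assumed stopping rule
            break
        module.append(int(best))
    return module

# Toy network with 6 genes (invented data, for illustration only).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0
raw = np.array([0.9, 0.8, -0.5, 0.7, 0.1, 0.6])

kernel = random_walk_kernel(adj, steps=2)
smoothed = smooth_scores(kernel, raw)
seed = int(np.argmax(smoothed))
print(greedy_module(adj, smoothed, seed, max_size=4))
```

Because the module only ever grows through neighbors of its current members, the result is connected by construction, which mirrors the "strong connectivity" goal stated in the abstract; ranking by degree as well as score, as ModFinder does, is not modeled here.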