Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto, 611-0011, Japan.
Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, 300, Hsinchu, Taiwan.
BMC Bioinformatics. 2021 Mar 22;22(1):143. doi: 10.1186/s12859-021-04062-2.
Recently, many computational methods have been proposed to predict cancer genes. A typical approach is to identify genes that are differentially expressed between tumour and normal samples. However, some genes, for example 'dark' genes, play important roles at the network level but are difficult to find by traditional differential gene expression analysis. Network controllability methods, such as the minimum feedback vertex set (MFVS) method, have also been used frequently in cancer gene prediction. However, traditional MFVS methods ignore the weights of vertices (or genes), which makes it difficult to single out an optimal solution because a network may have many possible MFVSs.
Here, we introduce a novel method, called weighted MFVS (WMFVS), which integrates gene differential expression values with MFVS to select the maximum-weighted MFVS from all possible MFVSs in a protein interaction network. Our experimental results show that WMFVS achieves better performance than traditional bio-data analysis or network-data analysis used alone.
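The selection step described above can be illustrated on a toy network. The following is a minimal brute-force sketch, not the authors' implementation: it assumes a directed network, enumerates vertex subsets by increasing size, and is only practical for very small graphs. The function names, the example graph, and the weights (standing in for differential expression values) are all hypothetical.

```python
from itertools import combinations


def is_acyclic(nodes, edges):
    """Return True if the directed graph induced on `nodes` has no cycle (Kahn's algorithm)."""
    nodes = set(nodes)
    indegree = {v: 0 for v in nodes}
    successors = {v: [] for v in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            successors[u].append(v)
            indegree[v] += 1
    queue = [v for v in nodes if indegree[v] == 0]
    visited = 0
    while queue:
        u = queue.pop()
        visited += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    # Every vertex is removed exactly when no cycle remains.
    return visited == len(nodes)


def all_minimum_fvs(nodes, edges):
    """Enumerate every feedback vertex set of minimum size (exponential time)."""
    nodes = list(nodes)
    for size in range(len(nodes) + 1):
        found = [set(subset) for subset in combinations(nodes, size)
                 if is_acyclic(set(nodes) - set(subset), edges)]
        if found:
            return found
    return []


def max_weight_mfvs(nodes, edges, weight):
    """Among all minimum feedback vertex sets, return one with the largest total weight."""
    candidates = all_minimum_fvs(nodes, edges)
    return max(candidates, key=lambda s: sum(weight[v] for v in s))


# Hypothetical toy network with two independent 2-cycles: A <-> B and C <-> D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "A"), ("C", "D"), ("D", "C")]
# Hypothetical vertex weights, e.g. absolute differential expression values.
weight = {"A": 2.0, "B": 0.5, "C": 0.1, "D": 1.5}

best = max_weight_mfvs(nodes, edges, weight)
```

In this toy network there are four minimum feedback vertex sets of size two (one vertex from each cycle), so an unweighted method has no basis for preferring one over another. With the weights above, the weighted selection picks A and D. A network of realistic size would need an exact solver or a heuristic rather than subset enumeration.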
This method combines the advantages of differential gene expression analysis and network analysis: it improves on the low accuracy of differential gene expression analysis and reduces the instability of purely network-based analysis. Furthermore, WMFVS can be easily applied to various kinds of networks, providing a useful framework for data analysis and prediction.