Shang Junliang, Wang Jing, Sun Yan, Li Feng, Liu Jin-Xing, Zhang Honghai
School of Computer Science, Qufu Normal University, Rizhao 276826, China.
College of Life Science, Qufu Normal University, Qufu 273165, China.
Bioinformatics. 2021 Sep 29;37(18):2920-2929. doi: 10.1093/bioinformatics/btab182.
For network-assisted analysis, which has become a popular method of data mining, network construction is a crucial task. Network construction relies on the accurate quantification of direct associations among variables. The existence of multiscale associations among variables presents several quantification challenges, especially when quantifying nonlinear direct interactions.
In this study, the multiscale part mutual information (MPMI), based on part mutual information (PMI) and nonlinear partial association (NPA), was developed for effectively quantifying nonlinear direct associations among variables in networks with multiscale associations. First, we defined the MPMI in theory and derived its five important properties. Second, an experiment in a three-node network was carried out to numerically assess its quantification ability under two cases of strong associations. Third, the MPMI was compared with the PMI, NPA and conditional mutual information in experiments on simulated datasets and on datasets from the DREAM challenge project. Finally, the MPMI was applied to real datasets of glioblastoma and lung adenocarcinoma to validate its effectiveness. Results showed that the MPMI is an effective alternative measure for quantifying nonlinear direct associations in networks, especially those with multiscale associations.
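To make the distinction between direct and indirect association concrete, the following is a minimal sketch of one of the baseline measures named above, conditional mutual information (CMI), on a simulated three-node chain. It is not the MPMI, PMI or NPA, and it is not taken from the authors' code. The chain X -> Z -> Y, the noise levels, the sample size and the function names `gaussian_mi` and `gaussian_cmi` are all assumptions chosen for illustration, and the closed-form expressions hold only under a joint-Gaussian assumption.

```python
import numpy as np


def gaussian_mi(x, y):
    """Mutual information (nats), assuming (x, y) are jointly Gaussian."""
    r = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log(1.0 - r**2)


def gaussian_cmi(x, y, z):
    """Conditional mutual information I(X;Y|Z) (nats), assuming joint Gaussianity.

    Uses the first-order partial correlation of x and y given z.
    """
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    r_partial = (r_xy - r_xz * r_yz) / np.sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))
    return -0.5 * np.log(1.0 - r_partial**2)


# Hypothetical three-node chain X -> Z -> Y: X and Y are associated only via Z.
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
z = x + 0.5 * rng.normal(size=n)
y = z + 0.5 * rng.normal(size=n)

mi_xy = gaussian_mi(x, y)               # marginal association: clearly non-zero
cmi_xy_given_z = gaussian_cmi(x, y, z)  # direct association: close to zero

print(f"I(X;Y)   = {mi_xy:.4f} nats")
print(f"I(X;Y|Z) = {cmi_xy_given_z:.4f} nats")
```

In this sketch the marginal mutual information between X and Y is substantial, while conditioning on Z drives it close to zero, which is the sense in which a direct-association measure should not report an X–Y edge here. The Gaussian shortcut cannot capture nonlinear dependence, which is the setting the measures in this study are designed for.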
The source code of MPMI is available online at https://github.com/CDMB-lab/MPMI.
Supplementary data are available at Bioinformatics online.