Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada.
Departments of Statistics and Biostatistics, University of Washington, Seattle, Washington, USA.
Biometrics. 2022 Sep;78(3):1018-1030. doi: 10.1111/biom.13464. Epub 2021 Apr 12.
In this paper, we consider data consisting of multiple networks, each composed of a different edge set on a common set of nodes. Many models have been proposed for the analysis of such multiview network data under the assumption that the data views are closely related. We provide tools for evaluating this assumption. In particular, we ask: given two networks that each follow a stochastic block model, is there an association between the latent community memberships of the nodes in the two networks? To answer this question, we extend the stochastic block model for a single network view to the two-view setting, and develop a new hypothesis test for the null hypothesis that the latent community memberships in the two data views are independent. We apply our test to protein-protein interaction data from the HINT database. We find evidence of a weak association between the latent community memberships of proteins defined with respect to binary interaction data and the latent community memberships of proteins defined with respect to cocomplex association data. We also extend this proposal to the setting of a network with node covariates. The proposed methods extend readily to three or more network/multivariate data views.
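To make the question concrete, the following is a minimal illustrative sketch and not the test developed in the paper. It assumes two communities per view, estimates memberships in each view with a crude spectral step, and then checks for association between the two sets of estimated labels using a permutation test on a G (likelihood-ratio) statistic. All function names and parameter values (`simulate_sbm`, `two_block_labels`, `g_statistic`, `permutation_pvalue`, `p_in`, `p_out`) are hypothetical choices made for this illustration, and treating estimated labels as if they were observed ignores the estimation uncertainty that a purpose-built test would have to account for.

```python
import numpy as np

def simulate_sbm(z, p_in, p_out, rng):
    """Draw a symmetric, loop-free adjacency matrix from a simple SBM.

    Assumption: one within-community edge probability (p_in) and one
    between-community edge probability (p_out).
    """
    n = len(z)
    same = z[:, None] == z[None, :]
    probs = np.where(same, p_in, p_out)
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(float)

def two_block_labels(adj):
    """Crude K = 2 community estimate: sign pattern of the eigenvector
    belonging to the second-largest eigenvalue of the adjacency matrix."""
    _, vecs = np.linalg.eigh(adj)  # eigenvalues come back in ascending order
    return (vecs[:, -2] > 0).astype(int)

def g_statistic(a, b, k=2):
    """G statistic of the k x k contingency table of two label vectors."""
    table = np.zeros((k, k))
    np.add.at(table, (a, b), 1)
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    mask = table > 0  # empty cells contribute zero to the sum
    return 2.0 * np.sum(table[mask] * np.log(table[mask] / expected[mask]))

def permutation_pvalue(a, b, n_perm=999, seed=0):
    """Permutation p-value for independence of two label vectors."""
    rng = np.random.default_rng(seed)
    observed = g_statistic(a, b)
    null = np.array([g_statistic(a, rng.permutation(b)) for _ in range(n_perm)])
    return (1 + np.sum(null >= observed)) / (n_perm + 1)

# Toy example: both views share the same latent memberships,
# so the two sets of estimated labels should be strongly associated.
rng = np.random.default_rng(1)
z = rng.integers(0, 2, size=200)
view1 = simulate_sbm(z, p_in=0.30, p_out=0.05, rng=rng)
view2 = simulate_sbm(z, p_in=0.30, p_out=0.05, rng=rng)
p_value = permutation_pvalue(two_block_labels(view1), two_block_labels(view2))
print(f"permutation p-value: {p_value:.3f}")
```

The G statistic is unchanged when community labels are swapped within a view, so the arbitrary labeling produced by the spectral step does not affect the result. Drawing the memberships of the second view independently of the first gives a way to look at the behavior of this sketch under the null hypothesis.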