Xia Yin, Li Lexin
Fudan University and University of California at Berkeley.
Stat Sin. 2022;32:293-321. doi: 10.5705/ss.202019.0361.
Comparing two population means of network data is of paramount importance in a wide range of scientific applications. Many existing network inference solutions focus on global testing of entire networks, without comparing individual network links. The observed data often take the form of vectors or matrices, and the problem is formulated as comparing two covariance or precision matrices under a normal or matrix normal distribution. Moreover, many tests suffer from limited power when the sample size is small. In this article, we tackle the problem of network comparison, covering both global and simultaneous inference, when the data come in a different format: a collection of symmetric matrices, each of which encodes the network structure of an individual subject. Such a data format commonly arises in applications such as brain connectivity analysis and clinical genomics. We no longer require the underlying data to follow a normal distribution, but instead impose moment conditions that are easily satisfied for numerous types of network data. Furthermore, we propose a power enhancement procedure and show that it can control false discovery while having the potential to substantially enhance the power of the test. We investigate the efficacy of our testing procedure through both an asymptotic analysis and a simulation study under finite sample sizes. We further illustrate our method with examples from brain connectivity analysis.
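To make the data format concrete, the following is a minimal sketch of edge-wise two-sample testing on collections of symmetric matrices, one matrix per subject. It is not the procedure proposed in the article: it assumes a generic Welch-type statistic per edge with a normal approximation, a Bonferroni-style maximum test for the global hypothesis, and the Benjamini-Hochberg step-up rule for simultaneous inference, and it omits the power enhancement step entirely. All function names, the simulated data, and the parameter values are hypothetical.

```python
import math
import random
from statistics import mean, variance


def simulate(n, p, shift, rng):
    """Simulate n symmetric p x p matrices; edge (0, 1) has mean `shift` (hypothetical data)."""
    mats = []
    for _ in range(n):
        m = [[0.0] * p for _ in range(p)]
        for i in range(p):
            for j in range(i + 1, p):
                m[i][j] = m[j][i] = rng.gauss(0.0, 1.0)
        m[0][1] += shift
        m[1][0] += shift
        mats.append(m)
    return mats


def edgewise_stats(group_x, group_y):
    """Welch-type standardized mean difference for each edge (i, j) with i < j."""
    p = len(group_x[0])
    stats = {}
    for i in range(p):
        for j in range(i + 1, p):
            x = [m[i][j] for m in group_x]
            y = [m[i][j] for m in group_y]
            se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
            stats[(i, j)] = (mean(x) - mean(y)) / se
    return stats


def two_sided_p(z):
    """Two-sided p-value under a standard normal approximation (an assumption here)."""
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals, alpha=0.05):
    """Return the set of edges rejected by the Benjamini-Hochberg step-up rule."""
    ordered = sorted(pvals.items(), key=lambda kv: kv[1])
    m = len(ordered)
    cutoff = 0
    for rank, (_, pv) in enumerate(ordered, start=1):
        if pv <= alpha * rank / m:
            cutoff = rank
    return {edge for edge, _ in ordered[:cutoff]}


rng = random.Random(0)
group_x = simulate(n=40, p=6, shift=0.0, rng=rng)  # e.g. control subjects
group_y = simulate(n=40, p=6, shift=2.0, rng=rng)  # one link differs in mean

stats = edgewise_stats(group_x, group_y)
pvals = {edge: two_sided_p(z) for edge, z in stats.items()}

# Global test: reject if the smallest p-value survives a Bonferroni correction.
alpha = 0.05
global_reject = min(pvals.values()) * len(pvals) <= alpha

# Simultaneous inference: which individual links differ?
rejected_edges = benjamini_hochberg(pvals, alpha=alpha)

print("global reject:", global_reject)
print("rejected edges:", sorted(rejected_edges))
```

Only the upper triangle is tested because each matrix is symmetric, so a network with p nodes yields p(p-1)/2 edge-level hypotheses. The normal approximation is a convenience of this sketch and stands in for the moment conditions the article relies on.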