Lei Jing, Lin Kevin Z
Department of Statistics and Data Science, Carnegie Mellon University, USA.
Department of Statistics, Wharton School of Business, University of Pennsylvania, USA.
J Am Stat Assoc. 2023;118(544):2433-2445. doi: 10.1080/01621459.2022.2054817. Epub 2022 Apr 25.
We consider the problem of estimating common community structures in multi-layer stochastic block models, where each individual layer may lack sufficient signal strength to recover the full community structure. To aggregate signal efficiently across layers, we argue that the sum of squared adjacency matrices contains sufficient signal even when individual layers are very sparse. Our method includes a bias-removal step, which is necessary because the squared noise matrices may overwhelm the signal in the very sparse regime. The analysis of our method relies on several novel tail probability bounds for matrix linear combinations with matrix-valued coefficients and for matrix-valued quadratic forms, which may be of independent interest. The performance of our method and the necessity of bias removal are demonstrated on synthetic data and in a microarray analysis of gene co-expression networks.
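The aggregation idea described above can be illustrated with a minimal sketch; it is not the authors' implementation. It rests on assumptions the abstract does not spell out: layers are binary, symmetric, and free of self-loops; the bias is removed by subtracting each layer's degrees from the diagonal (for such matrices the diagonal of A² equals the node degrees); and communities are read off with a plain k-means on the leading eigenvectors. All function names and simulation parameters are hypothetical.

```python
import numpy as np

def simulate_layers(n, K, L, p_in, p_out, rng):
    """Draw L sparse layers that share one community labelling (toy simulator)."""
    labels = np.arange(n) % K
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    layers = []
    for _ in range(L):
        upper = np.triu(rng.random((n, n)) < probs, k=1)  # strict upper triangle
        layers.append((upper | upper.T).astype(float))    # symmetric, no self-loops
    return labels, layers

def bias_adjusted_sum_of_squares(layers):
    """Sum A @ A over layers, subtracting each layer's degrees from the diagonal."""
    n = layers[0].shape[0]
    S = np.zeros((n, n))
    for A in layers:
        S += A @ A
        # Assumed bias removal: diag(A @ A) equals the degree vector here.
        S -= np.diag(A.sum(axis=1))
    return S

def spectral_embedding(S, K):
    """Eigenvectors of S for the K eigenvalues of largest magnitude."""
    vals, vecs = np.linalg.eigh(S)
    top = np.argsort(np.abs(vals))[-K:]
    return vecs[:, top]

def kmeans(X, K, rng, n_iter=50, n_init=10):
    """Plain Lloyd's algorithm with random restarts; keeps the lowest-cost run."""
    best_assign, best_cost = None, np.inf
    for _ in range(n_init):
        centers = X[rng.choice(len(X), K, replace=False)]
        for _ in range(n_iter):
            dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            assign = dist.argmin(axis=1)
            for k in range(K):
                if np.any(assign == k):
                    centers[k] = X[assign == k].mean(axis=0)
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        cost = dist.min(axis=1).sum()
        if cost < best_cost:
            best_assign, best_cost = dist.argmin(axis=1), cost
    return best_assign

def estimate_communities(layers, K, rng):
    """Bias-adjusted sum of squares, then spectral embedding, then k-means."""
    S = bias_adjusted_sum_of_squares(layers)
    return kmeans(spectral_embedding(S, K), K, rng)
```

A call such as `estimate_communities(layers, K=2, rng=np.random.default_rng(0))` returns one integer label per node. Dropping the `np.diag` subtraction gives the unadjusted sum of squares, which is a simple way to see the effect of the bias term in sparse simulations.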