Zheng Lili, Allen Genevera I
Department of Electrical and Computer Engineering, Rice University.
Department of Computer Science, Rice University.
J Am Stat Assoc. 2024;119(547):2282-2293. doi: 10.1080/01621459.2023.2256503. Epub 2023 Oct 20.
In this paper, we investigate the Gaussian graphical model inference problem in a novel setting that we call erose measurements, referring to irregularly measured or observed data. For graphs, this results in different node pairs having vastly different sample sizes, a situation that frequently arises in data integration, genomics, neuroscience, and sensor networks. Existing works characterize graph selection performance using the minimum pairwise sample size, which provides little insight for erosely measured data, and no existing inference method is applicable. We aim to fill this gap by proposing the first inference method that characterizes the different uncertainty levels over the graph caused by erose measurements, named GI-JOE (Graph Inference when Joint Observations are Erose). Specifically, we develop an edge-wise inference method and an affiliated FDR control procedure, where the variance of each edge depends on the sample sizes associated with the corresponding neighbors. We prove statistical validity under erose measurements, thanks to careful localized edge-wise analysis and disentangling of the dependencies across the graph. Finally, through simulation studies and a real neuroscience data example, we demonstrate the advantages of our inference methods for graph selection from erosely measured data.
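To make the notion of erose measurements concrete, the following is a minimal sketch of how pairwise sample sizes differ across node pairs when variables are observed irregularly. It is not the GI-JOE procedure itself; the function names `pairwise_sample_sizes` and `pairwise_covariance`, the boolean observation mask, and the pairwise-complete covariance estimate are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def pairwise_sample_sizes(mask):
    """Count joint observations for every pair of variables.

    mask: (n, p) boolean array; mask[i, j] is True when variable j
    was observed in sample i. (Hypothetical input format.)
    Returns a (p, p) integer matrix whose (j, k) entry is the number
    of samples in which variables j and k were both observed.
    """
    m = np.asarray(mask).astype(int)
    return m.T @ m


def pairwise_covariance(x, mask):
    """Pairwise-complete covariance estimate (illustrative assumption).

    Each variable is centered by the mean of its own observed values,
    and entry (j, k) averages products over jointly observed samples.
    Pairs that are never jointly observed yield NaN.
    """
    mask = np.asarray(mask, dtype=bool)
    x = np.asarray(x, dtype=float)
    n_obs = mask.sum(axis=0)
    # Zero out unobserved entries so they do not contribute to sums.
    x_filled = np.where(mask, x, 0.0)
    mean = x_filled.sum(axis=0) / n_obs
    centered = np.where(mask, x - mean, 0.0)
    n_pair = pairwise_sample_sizes(mask)
    with np.errstate(invalid="ignore", divide="ignore"):
        cov = (centered.T @ centered) / n_pair
    return cov, n_pair


# Small example: three variables, with variables 1 and 2 rarely co-observed.
example_mask = np.array(
    [[True, True, False],
     [True, False, True],
     [True, True, True]]
)
example_sizes = pairwise_sample_sizes(example_mask)
```

In this toy mask the pair (1, 2) is jointly observed only once while the pair (0, 1) is jointly observed twice, so a single summary such as the minimum pairwise sample size would hide the fact that some edges are supported by more data than others.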