Cheng Cong, Ke Yuan, Zhang Wenyang
Department of Statistics, University of Georgia.
Faculty of Business Administration, University of Macau.
J Am Stat Assoc. 2025 Feb 10. doi: 10.1080/01621459.2024.2442092.
The estimation of large precision matrices is crucial in modern multivariate analysis. Traditional sparsity assumptions, while useful, often fall short of accurately capturing the dependencies among features. This paper addresses this limitation by focusing on precision matrix estimation for multivariate data characterized by a flexible yet unknown group structure. We introduce a novel approach that begins with the detection of this unknown group structure, clustering features within the low-dimensional space defined by the leading eigenvectors of the sample covariance matrix. Following this, we employ group-wise multivariate response linear regressions, guided by the identified group memberships, to estimate the precision matrix. We rigorously establish the theoretical foundations of our proposed method for both group detection and precision matrix estimation. The superior numerical performance of our approach is demonstrated through comprehensive simulation experiments and a comparative analysis with established methods in the field. Additionally, we apply our method to a real breast cancer dataset, showcasing its practical utility and effectiveness.
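The two-stage procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes the number of groups K is known, uses a plain k-means on the rows of the leading eigenvectors, and uses ordinary least squares for the group-wise regressions, which only makes sense when the sample size exceeds the dimension. All function names (`leading_eigvecs`, `kmeans_rows`, `groupwise_precision`) and the simulated block-structured data are hypothetical and chosen for illustration only. The precision blocks are recovered from the standard Gaussian identity: if a group G is regressed on the remaining features with coefficient matrix B and residual covariance R, then Omega_GG = R^{-1} and Omega_{G,-G} = -Omega_GG B^T.

```python
import numpy as np

def leading_eigvecs(S, K):
    # eigh returns eigenvalues in ascending order, so the last K columns
    # are the eigenvectors of the K largest eigenvalues.
    _, vecs = np.linalg.eigh(S)
    return vecs[:, -K:]

def kmeans_rows(U, K, n_iter=50):
    # Simple k-means on the rows of U (one row per feature), with a
    # deterministic farthest-point initialisation. Illustrative only.
    centers = [U[0]]
    for _ in range(1, K):
        d = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = U[labels == k].mean(axis=0)
    return labels

def groupwise_precision(X, labels):
    # For each group, regress its features on all other features (OLS,
    # an assumption of this sketch) and fill in the matching rows of the
    # precision matrix from the coefficients and residual covariance.
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    omega = np.zeros((p, p))
    for g in np.unique(labels):
        inside = np.where(labels == g)[0]
        outside = np.where(labels != g)[0]
        Y, Z = Xc[:, inside], Xc[:, outside]
        B, *_ = np.linalg.lstsq(Z, Y, rcond=None)  # shape: |outside| x |inside|
        resid = Y - Z @ B
        omega_gg = np.linalg.inv(resid.T @ resid / n)
        omega[np.ix_(inside, inside)] = omega_gg
        omega[np.ix_(inside, outside)] = -omega_gg @ B.T
    return (omega + omega.T) / 2  # symmetrise against rounding error

# Simulated data: 12 features in 3 groups of 4, correlated only within groups.
rng = np.random.default_rng(0)
true_groups = np.repeat([0, 1, 2], 4)
p = true_groups.size
Sigma = 0.3 * np.eye(p) + 0.7 * (true_groups[:, None] == true_groups[None, :])
X = rng.multivariate_normal(np.zeros(p), Sigma, size=2000)

S = np.cov(X, rowvar=False, bias=True)            # sample covariance (1/n scaling)
labels = kmeans_rows(leading_eigvecs(S, 3), 3)    # stage 1: group detection
omega_hat = groupwise_precision(X, labels)        # stage 2: precision estimate
```

With unpenalised least squares, the stage-two estimate coincides algebraically with the inverse of the sample covariance matrix, so this sketch only demonstrates the mechanics. In a high-dimensional setting the regression step would need some form of regularisation, and how the paper handles that is not stated in the abstract.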