Mixed Matrix Completion in Complex Survey Sampling under Heterogeneous Missingness.

Authors

Mao Xiaojun, Wang Hengfang, Wang Zhonglei, Yang Shu

Affiliations

School of Mathematical Sciences, Ministry of Education Key Laboratory of Scientific and Engineering Computing, Shanghai Jiao Tong University, Shanghai, 200240, China.

School of Mathematics and Statistics & Fujian Provincial Key Laboratory of Statistics and Artificial Intelligence, Fujian Normal University, Fujian 350007, China.

Publication information

J Comput Graph Stat. 2024;33(4):1320-1328. doi: 10.1080/10618600.2024.2319154. Epub 2024 Mar 29.

Abstract

Modern surveys with large sample sizes and growing mixed-type questionnaires require robust and scalable analysis methods. In this work, we consider recovering a mixed dataframe matrix, obtained by complex survey sampling, with entries following different canonical exponential distributions and subject to heterogeneous missingness. To tackle this challenging task, we propose a two-stage procedure: in the first stage, we model the entry-wise missing mechanism by logistic regression, and in the second stage, we complete the target parameter matrix by maximizing a weighted log-likelihood with a low-rank constraint. We propose a fast and scalable estimation algorithm that achieves sublinear convergence, and the upper bound for the estimation error of the proposed method is rigorously derived. Experimental results support our theoretical claims, and the proposed estimator shows its merits compared to other existing methods. The proposed method is applied to analyze the National Health and Nutrition Examination Survey data. Supplementary materials for this article are available online.
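
The two-stage idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' algorithm: it assumes a nuclear-norm penalty solved by proximal gradient as a stand-in for the low-rank constraint, unit-variance Gaussian and Bernoulli columns as the only entry types, and toy data in which the covariate `X`, the design weights `d`, the penalty `lam`, and the clipping level 0.05 are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def fit_logistic(X, r, n_iter=500, lr=0.5):
    # Stage 1: gradient ascent on the mean Bernoulli log-likelihood
    # of the response indicators r given covariates X.
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        beta += lr * X.T @ (r - sigmoid(X @ beta)) / len(r)
    return beta

def svt(M, tau):
    # Singular value soft-thresholding: the proximal map of tau * nuclear norm.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def objective(Theta, Yz, R, W, is_binary, lam):
    # Weighted negative log-likelihood (up to constants) plus nuclear-norm penalty.
    b = np.where(is_binary, np.logaddexp(0.0, Theta), 0.5 * Theta**2)
    nll = np.sum(W * R * (b - Yz * Theta))
    return nll + lam * np.linalg.svd(Theta, compute_uv=False).sum()

def complete(Yz, R, W, is_binary, lam, n_iter=300):
    # Stage 2: proximal gradient with step 1 / max(W), which bounds the
    # curvature of the weighted likelihood for these two entry types.
    step = 1.0 / W.max()
    Theta = np.zeros_like(Yz)
    for _ in range(n_iter):
        mean = np.where(is_binary, sigmoid(Theta), Theta)
        Theta = svt(Theta - step * W * R * (mean - Yz), step * lam)
    return Theta

# Toy data (all values hypothetical): 4 Gaussian and 2 binary columns.
n, m, rank = 200, 6, 2
is_binary = np.array([False] * 4 + [True] * 2)
Theta_true = rng.normal(size=(n, rank)) @ rng.normal(size=(rank, m))
Y = np.where(is_binary,
             rng.binomial(1, sigmoid(Theta_true)),
             Theta_true + rng.normal(size=(n, m)))

# Heterogeneous missingness: each column has its own logistic coefficients.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Gamma = rng.normal(loc=[[1.0], [0.5]], scale=0.3, size=(2, m))
R = rng.random((n, m)) < sigmoid(X @ Gamma)

# Stage 1: estimated response probabilities, clipped away from zero.
pi_hat = np.column_stack(
    [sigmoid(X @ fit_logistic(X, R[:, j].astype(float))) for j in range(m)]
)
pi_hat = np.clip(pi_hat, 0.05, 1.0)

# Stage 2: weights combine design weights d with inverse response probabilities.
d = rng.uniform(1.0, 3.0, size=n)
W = d[:, None] / pi_hat
Yz = np.where(R, Y, 0.0)  # unobserved entries are masked by R, so 0 is a placeholder
lam = 5.0
Theta_hat = complete(Yz, R, W, is_binary, lam)

obj_zero = objective(np.zeros_like(Yz), Yz, R, W, is_binary, lam)
obj_hat = objective(Theta_hat, Yz, R, W, is_binary, lam)
```

In this sketch the fitted natural parameters in `Theta_hat` map back to the data scale through the identity for the Gaussian columns and `sigmoid` for the binary ones. In practice `lam` would be tuned, for example by cross-validation on held-out observed entries.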

Similar articles

2
Matrix completion under complex survey sampling.
Ann Inst Stat Math. 2023 Jun;75(3):463-492. doi: 10.1007/s10463-022-00851-5. Epub 2022 Sep 19.
3
Noisy Tensor Completion via Low-Rank Tensor Ring.
IEEE Trans Neural Netw Learn Syst. 2022 Jun 17;PP. doi: 10.1109/TNNLS.2022.3181378.
4
Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.
Int J Biostat. 2017 Apr 20;13(1):/j/ijb.2017.13.issue-1/ijb-2016-0053/ijb-2016-0053.xml. doi: 10.1515/ijb-2016-0053.
5
Fast Robust Matrix Completion via Entry-Wise ℓ-Norm Minimization.
IEEE Trans Cybern. 2023 Nov;53(11):7199-7212. doi: 10.1109/TCYB.2022.3224070. Epub 2023 Oct 17.
7
High-dimensional principal component analysis with heterogeneous missingness.
J R Stat Soc Series B Stat Methodol. 2022 Nov;84(5):2000-2031. doi: 10.1111/rssb.12550. Epub 2022 Nov 20.
9
Structured Matrix Completion with Applications to Genomic Data Integration.
J Am Stat Assoc. 2016;111(514):621-633. doi: 10.1080/01621459.2015.1021005. Epub 2016 Aug 18.
