Using machine learning to impute legal status of immigrants in the National Health Interview Survey.

Author information

Ruhnke Simon A, Wilson Fernando A, Stimpson Jim P

Affiliations

Berliner Institut für empirische Integrations- und Migrationsforschung/BIM, Berlin, Germany.

University of Utah, Matheson Center for Health Care Studies, Salt Lake City, UT.

Publication information

MethodsX. 2022 Sep 8;9:101848. doi: 10.1016/j.mex.2022.101848. eCollection 2022.

Abstract

We describe a novel machine learning method for imputing the legal status of immigrants using nationally representative survey data from the Survey of Income and Program Participation (SIPP) and the National Health Interview Survey (NHIS). Two machine learning approaches, a K-nearest Neighbor (KNN) classifier and the Random Forest (RF) algorithm, are described as novel imputation methods and compared with established regression-based imputation. When the imputation methods were validated using sensitivity, specificity, positive predictive value (PPV), and accuracy statistics, the Random Forest algorithm was more accurate in identifying undocumented immigrants and minimized bias, both in the socio-demographic variables included in the imputation and in unobserved health variables, relative to regression-based imputation and KNN.

• We developed a new machine learning method of imputing legal status for immigrants that can be used with nationally representative, publicly available data.

• Our findings indicate that using machine learning to impute the legal status of immigrants, specifically the Random Forest algorithm, was more accurate in identifying undocumented immigrants and minimized bias relative to other imputation methods.
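The four validation statistics named in the abstract can all be derived from a two-by-two confusion matrix. The sketch below is illustrative only and is not the authors' code: the function name `validation_stats`, the choice of "undocumented" as the positive class (coded 1), and the toy labels are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class ValidationStats:
    sensitivity: float  # share of truly positive cases imputed as positive
    specificity: float  # share of truly negative cases imputed as negative
    ppv: float          # share of imputed positives that are truly positive
    accuracy: float     # share of all cases imputed correctly


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a statistic is undefined.
    return numerator / denominator if denominator else math.nan


def validation_stats(true_status, imputed_status) -> ValidationStats:
    """Compare imputed labels with known labels (1 = positive class)."""
    if len(true_status) != len(imputed_status):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(true_status, imputed_status))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return ValidationStats(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        accuracy=_ratio(tp + tn, len(pairs)),
    )


# Toy labels, invented for illustration (not data from the study).
true_labels = [1, 1, 1, 0, 0, 0, 0, 0]
imputed_labels = [1, 1, 0, 1, 0, 0, 0, 0]
stats = validation_stats(true_labels, imputed_labels)
print(stats)
```

In a validation setting like the one described, the known labels would come from a survey in which legal status is observed, and the same function could be applied to the output of each imputation method to compare them on a common footing.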

