Using machine learning to impute legal status of immigrants in the National Health Interview Survey.

Authors

Ruhnke Simon A, Wilson Fernando A, Stimpson Jim P

Affiliations

Berliner Institut für empirische Integrations- und Migrationsforschung/BIM, Berlin, Germany.

University of Utah, Matheson Center for Health Care Studies, Salt Lake City, UT.

Publication information

MethodsX. 2022 Sep 8;9:101848. doi: 10.1016/j.mex.2022.101848. eCollection 2022.

DOI:10.1016/j.mex.2022.101848

PMID:36160111

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9490167/

Abstract

We describe a novel machine learning method for imputing the legal status of immigrants using nationally representative survey data from the Survey of Income and Program Participation (SIPP) and the National Health Interview Survey (NHIS). A K-nearest neighbor (KNN) classifier and the Random Forest (RF) algorithm are described as novel imputation methods and compared with established regression-based imputation. When the imputation methods were validated using sensitivity, specificity, positive predictive value (PPV), and accuracy, the Random Forest algorithm was more accurate in identifying undocumented immigrants and, relative to regression-based imputation and KNN, minimized bias in both the socio-demographic variables included in the imputation and unobserved health variables.

• We developed a new machine learning method of imputing legal status for immigrants that can be used with nationally representative, publicly available data.

• Our findings indicate that using machine learning to impute the legal status of immigrants, specifically the Random Forest algorithm, was more accurate in identifying undocumented immigrants and minimized bias relative to other imputation methods.
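The validation step named in the abstract (sensitivity, specificity, PPV, accuracy) can be illustrated with a toy example. The sketch below is not the authors' implementation: it uses a hand-written K-nearest neighbor vote in place of their models, and the two numeric features, the tiny "donor" and "holdout" record sets, and the function names `knn_predict` and `validation_stats` are all hypothetical. It only shows how a classifier trained on records with observed legal status could be scored on held-out records whose status is known.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training records."""
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def _ratio(num, den):
    # Return NaN instead of raising when a denominator is empty.
    return num / den if den else float("nan")


def validation_stats(y_true, y_pred, positive=1):
    """Sensitivity, specificity, PPV and accuracy for a binary label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "sensitivity": _ratio(tp, tp + fn),  # share of true positives found
        "specificity": _ratio(tn, tn + fp),  # share of true negatives found
        "ppv": _ratio(tp, tp + fp),          # share of positive calls that are right
        "accuracy": _ratio(tp + tn, len(pairs)),
    }


# Hypothetical donor records with observed status (1 = undocumented, 0 = other).
# The two features are made-up, pre-scaled covariates, not real survey variables.
donor_X = [(0.10, 0.10), (0.20, 0.10), (0.15, 0.20),
           (0.80, 0.90), (0.90, 0.80), (0.85, 0.85)]
donor_y = [1, 1, 1, 0, 0, 0]

# Hypothetical held-out records whose true status is known, used for validation.
holdout_X = [(0.12, 0.15), (0.20, 0.20), (0.90, 0.90), (0.30, 0.30)]
holdout_y = [1, 1, 0, 0]

imputed = [knn_predict(donor_X, donor_y, x, k=3) for x in holdout_X]
stats = validation_stats(holdout_y, imputed)
print(imputed, stats)
```

In this toy data the last held-out record sits closer to the positive cluster and is misclassified, so specificity and PPV fall below 1 while sensitivity stays at 1. That is the kind of trade-off the four statistics are meant to expose when imputation methods are compared.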

Similar articles

A healthy migrant effect? Estimating health outcomes of the undocumented immigrant population in the United States using machine learning.

Soc Sci Med. 2022 Aug;307:115177. doi: 10.1016/j.socscimed.2022.115177. Epub 2022 Jun 30.

A Test of the Validity of Imputed Legal Immigration Status.

Demography. 2024 Apr 1;61(2):283-306. doi: 10.1215/00703370-11189687.

Comparison of Use of Health Care Services and Spending for Unauthorized Immigrants vs Authorized Immigrants or US Citizens Using a Machine Learning Model.

JAMA Netw Open. 2020 Dec 1;3(12):e2029230. doi: 10.1001/jamanetworkopen.2020.29230.

Can we spin straw into gold? An evaluation of immigrant legal status imputation approaches.

Demography. 2015 Feb;52(1):329-54. doi: 10.1007/s13524-014-0358-x.

Assessing and comparison of different machine learning methods in parent-offspring trios for genotype imputation.

J Theor Biol. 2016 Jun 21;399:148-58. doi: 10.1016/j.jtbi.2016.03.035. Epub 2016 Apr 2.

Legal Status, Time in the USA, and the Well-Being of Latinos in Los Angeles.

J Urban Health. 2017 Dec;94(6):764-775. doi: 10.1007/s11524-017-0197-3.

Imputation Match Bias in Immigrant Wage Convergence.

Demography. 2018 Aug;55(4):1475-1485. doi: 10.1007/s13524-018-0686-3.

Legal Status and Wage Disparities for Mexican Immigrants.

Soc Forces. 2010 Dec 1;89(2):491-513. doi: 10.1353/sof.2010.0082.

Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study.

Am J Epidemiol. 2014 Mar 15;179(6):764-74. doi: 10.1093/aje/kwt312. Epub 2014 Jan 12.
