Validation of a Machine Learning Model to Predict Childhood Lead Poisoning.

Author affiliations

Harris School of Public Policy, University of Chicago, Chicago, Illinois.

Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania.

Publication information

JAMA Netw Open. 2020 Sep 1;3(9):e2012734. doi: 10.1001/jamanetworkopen.2020.12734.

DOI:10.1001/jamanetworkopen.2020.12734

PMID:32936296

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7495240/

Abstract

IMPORTANCE

Childhood lead poisoning causes irreversible neurobehavioral deficits, but current practice is secondary prevention.

OBJECTIVE

To validate a machine learning (random forest) prediction model of elevated blood lead levels (EBLLs) by comparison with a parsimonious logistic regression.

DESIGN, SETTING, AND PARTICIPANTS

This prognostic study for temporal validation of multivariable prediction models used data from the Women, Infants, and Children (WIC) program of the Chicago Department of Public Health. Participants included a development cohort of children born from January 1, 2007, to December 31, 2012, and a validation WIC cohort born from January 1 to December 31, 2013. Blood lead levels were measured until December 31, 2018. Data were analyzed from January 1 to October 31, 2019.

EXPOSURES

Blood lead level test results; lead investigation findings; housing characteristics, permits, and violations; and demographic variables.

MAIN OUTCOMES AND MEASURES

Incident EBLL (≥6 μg/dL). Models were assessed using the area under the receiver operating characteristic curve (AUC) and confusion matrix metrics (positive predictive value, sensitivity, and specificity) at various thresholds.
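
As an illustration of these measures, the following is a minimal sketch of how AUC and the confusion matrix metrics at a "top fraction" threshold could be computed. It is not the study's code: the function names `auc` and `metrics_at_top_fraction`, the toy scores, and the toy labels are all hypothetical, and the AUC here uses the pairwise rank formulation (the share of positive-negative pairs in which the positive case receives the higher score, with ties counted as half).

```python
from math import ceil


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def metrics_at_top_fraction(scores, labels, fraction):
    """Flag the highest-scoring `fraction` of cases, then compute PPV, sensitivity, specificity."""
    total_pos = sum(labels)
    if total_pos == 0 or total_pos == len(labels):
        raise ValueError("need at least one positive and one negative label")
    k = max(1, ceil(fraction * len(scores)))  # number of cases flagged as high risk
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = order[:k]
    tp = sum(labels[i] for i in flagged)      # flagged and truly positive
    fp = k - tp                               # flagged but negative
    fn = total_pos - tp                       # positive but not flagged
    tn = len(labels) - k - fn                 # negative and not flagged
    return {
        "ppv": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical toy data: 10 children, 3 of whom develop an EBLL (label 1).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

print(auc(scores, labels))
print(metrics_at_top_fraction(scores, labels, 0.2))
```

Fixing the flagged fraction rather than a score cutoff mirrors a setting where prevention resources cover only a set share of children, so PPV and sensitivity describe how well that limited capacity is spent.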

RESULTS

Among 6812 children in the WIC validation cohort, 3451 (50.7%) were female, 3057 (44.9%) were Hispanic, 2804 (41.2%) were non-Hispanic Black, 458 (6.7%) were non-Hispanic White, and 442 (6.5%) were Asian (mean [SD] age, 5.5 [0.3] years). The median year of housing construction was 1919 (interquartile range, 1903-1948). Random forest AUC was 0.69 compared with 0.64 for logistic regression (difference, 0.05; 95% CI, 0.02-0.08). When predicting the 5% of children at highest risk to have EBLLs, random forest and logistic regression models had positive predictive values of 15.5% and 7.8%, respectively (difference, 7.7%; 95% CI, 3.7%-11.3%), sensitivity of 16.2% and 8.1%, respectively (difference, 8.1%; 95% CI, 3.9%-11.7%), and specificity of 95.5% and 95.1% (difference, 0.4%; 95% CI, 0.0%-0.7%).
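
The reported figures at the 5% threshold can be cross-checked with some back-of-the-envelope arithmetic. The sketch below is an assumption-laden reading, not a reproduction of the study's analysis: it assumes that exactly 5% of the 6812 children were flagged, and it derives approximate counts (true positives, total EBLL cases) that the abstract does not report. The helper `implied_counts` is hypothetical, and rounding in the published percentages means the derived values are only approximate.

```python
n_children = 6812      # validation cohort size, from the abstract
flag_fraction = 0.05   # top 5% of children flagged as highest risk

# Percentages reported in the abstract, expressed as proportions.
reported = {
    "random_forest": {"ppv": 0.155, "sensitivity": 0.162, "specificity": 0.955},
    "logistic_regression": {"ppv": 0.078, "sensitivity": 0.081, "specificity": 0.951},
}


def implied_counts(ppv, sensitivity, n, fraction):
    """Derive approximate counts from PPV and sensitivity (not reported in the abstract)."""
    flagged = round(fraction * n)          # assumed number of children flagged
    true_pos = ppv * flagged               # PPV = TP / flagged
    positives = true_pos / sensitivity     # sensitivity = TP / all EBLL cases
    false_pos = flagged - true_pos
    negatives = n - positives
    return {
        "flagged": flagged,
        "true_positives": true_pos,
        "positives": positives,
        "implied_specificity": 1 - false_pos / negatives,
    }


implied = {
    name: implied_counts(m["ppv"], m["sensitivity"], n_children, flag_fraction)
    for name, m in reported.items()
}

for name, counts in implied.items():
    print(name, {key: round(value, 3) for key, value in counts.items()})
```

Under these assumptions, both models imply a similar number of EBLL cases in the cohort, and the specificity derived from PPV and sensitivity lands close to the reported value for each model, which suggests the three reported metrics are mutually consistent.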

CONCLUSIONS AND RELEVANCE

The machine learning model outperformed regression in predicting childhood lead poisoning, especially in identifying children at highest risk. Such a model could be used to target the allocation of lead poisoning prevention resources to these children.
