EL V.2 Model for Predicting Food Safety Risks at Taiwan Border Using the Voting-Based Ensemble Method.

Authors

Wu Li-Ya, Liu Fang-Ming, Weng Sung-Shun, Lin Wen-Chou

Affiliations

Food and Drug Administration, Ministry of Health and Welfare, Taipei 115209, Taiwan.

Department of Information and Finance Management, National Taipei University of Technology, Taipei 10608, Taiwan.

Publication information

Foods. 2023 May 24;12(11):2118. doi: 10.3390/foods12112118.

DOI:10.3390/foods12112118

PMID:37297360

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10252765/

Abstract

Border management is a crucial control point at which governments regulate the quality and safety of imported food. In 2020, the first-generation ensemble learning prediction model (EL V.1) was introduced into Taiwan's border food management. This model assesses the risk of imported food by combining five algorithms to determine whether quality sampling should be performed at the border. In this study, a second-generation ensemble learning prediction model (EL V.2) was developed based on seven algorithms to enhance the "detection rate of unqualified cases" and improve the robustness of the model. Elastic Net was used to select the characteristic risk factors. Two algorithms were used to construct the new model: the Bagging-Gradient Boosting Machine and Bagging-Elastic Net. In addition, Fβ was used to flexibly control the sampling rate, improving the predictive performance and robustness of the model. The chi-square test was employed to compare the efficacy of "pre-launch (2019) random sampling inspection" and "post-launch (2020-2022) model prediction sampling inspection". For cases recommended for inspection by the ensemble learning model and subsequently inspected, the unqualified rates were 5.10%, 6.36%, and 4.39% in 2020, 2021, and 2022, respectively, which were significantly higher (p < 0.001) than the unqualified rate of 2.09% under random sampling in 2019. Prediction indices derived from the confusion matrix were used to further evaluate EL V.1 and EL V.2: EL V.2 exhibited superior predictive performance compared with EL V.1, and both models outperformed random sampling.
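To make the evaluation terms in the abstract concrete, the following is a minimal Python sketch of a voting ensemble scored with confusion-matrix counts and an F-beta measure. It is not the authors' implementation. The abstract does not state the voting rule, so a simple "flag the case when at least `min_votes` models vote unqualified" rule is assumed here, and all function names, the `min_votes` parameter, and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def vote(model_votes: Sequence[Sequence[int]], min_votes: int) -> List[int]:
    """Flag a case (1) when at least `min_votes` models vote 'unqualified'.

    `model_votes` holds one list per model, each with one 0/1 vote per case.
    The `min_votes` rule is an assumption, not taken from the paper.
    """
    return [1 if sum(case) >= min_votes else 0 for case in zip(*model_votes)]


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) with 1 = unqualified, 0 = qualified."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f_beta(tp: int, fp: int, fn: int, beta: float) -> float:
    """F-beta = (1 + b^2) * tp / ((1 + b^2) * tp + b^2 * fn + fp).

    beta > 1 weights recall (catching unqualified cases) more heavily;
    beta < 1 weights precision more heavily.
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def sampling_rate(y_pred: Sequence[int]) -> float:
    """Share of cases the ensemble recommends for inspection."""
    return sum(y_pred) / len(y_pred) if y_pred else 0.0


if __name__ == "__main__":
    # Toy votes from three hypothetical models over six cases (not study data).
    toy_votes = [
        [1, 0, 1, 0, 0, 1],
        [1, 0, 0, 0, 1, 1],
        [0, 0, 1, 0, 1, 1],
    ]
    toy_truth = [1, 1, 0, 0, 0, 1]
    flagged = vote(toy_votes, min_votes=2)
    tp, fp, fn, tn = confusion_counts(toy_truth, flagged)
    print(flagged, (tp, fp, fn, tn), f_beta(tp, fp, fn, beta=2.0), sampling_rate(flagged))
```

In a sketch like this, raising `min_votes` lowers the sampling rate at the cost of recall, while the choice of beta determines how strongly missed unqualified cases are penalized relative to unnecessary inspections. This is one plausible reading of how an F-beta criterion could steer the sampling rate; the abstract itself does not give the procedure.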
