Department of Healthcare Administration, Eulji University, Seongnam 13135, Korea.
Department of Health Administration, Dankook University, Cheonan 31116, Korea.
Int J Environ Res Public Health. 2020 Jan 31;17(3):897. doi: 10.3390/ijerph17030897.
(1) Medical research has shown increasing interest in machine learning, which permits the analysis of massive multivariate data. We therefore developed models to predict mortality from drug intoxication and compared machine learning models with traditional logistic regression. (2) A total of 8,937 cases categorized as drug intoxication were extracted from Korea Centers for Disease Control and Prevention data (2008-2017). We trained, validated, and tested each model on these data and compared their performance using three measures: Brier score, calibration slope, and calibration-in-the-large. (3) A chi-square test showed that mortality risk differed statistically significantly by severity, intent, toxic substance, age, and sex. The multilayer perceptron (MLP) model had the highest area under the curve (AUC) and the lowest Brier score in the training and validation phases, while the logistic regression (LR) model showed the highest AUC (0.827) and the lowest Brier score (0.0307) in the testing phase. MLP had the second-highest AUC (0.816) and the second-lowest Brier score (0.003258) in the testing phase, outperforming the decision tree model. (4) Given the complexity of choosing tuning parameters, LR proved competitive on medical datasets, which demand strict accuracy.