Comparison of machine learning algorithms applied to symptoms to determine infectious causes of death in children: national survey of 18,000 verbal autopsies in the Million Death Study in India.

Author information

Biomedical Informatics Centre, Indian Council of Medical Research-National Institute for Research in Reproductive Health, Mumbai, 400012, India.

Centre for Global Health Research, St. Michael's Hospital, Unity Health Toronto, and Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada.

Publication information

BMC Public Health. 2021 Oct 4;21(1):1787. doi: 10.1186/s12889-021-11829-y.

DOI:10.1186/s12889-021-11829-y

PMID:34607591

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8488544/

Abstract

BACKGROUND

Machine learning (ML) algorithms have been successfully employed to predict outcomes in clinical research. In this study, we explored the application of ML-based algorithms to predict cause of death (CoD) from verbal autopsy records available through the Million Death Study (MDS).

METHODS

From the MDS, 18,826 unique childhood deaths at ages 1-59 months during 2004-13 were selected for generating the prediction models; over 70% of these deaths were caused by six infectious diseases (pneumonia, diarrhoeal diseases, malaria, fever of unknown origin, meningitis/encephalitis, and measles). Six popular ML-based algorithms (support vector machine, gradient boosting modeling, C5.0, artificial neural network, k-nearest neighbor, and classification and regression tree) were used to build the CoD prediction models.
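To make the classification setup concrete, the following is a minimal sketch of one of the six algorithm families named above, k-nearest neighbor, applied to binary symptom indicators. It is not the authors' implementation: the symptom columns, the toy records, the Hamming distance, and the names `TRAIN`, `hamming`, and `knn_predict` are all assumptions made for illustration, and none of the values come from the MDS.

```python
from collections import Counter

# Hypothetical symptom indicators (1 = reported, 0 = not reported).
# Assumed column order: fever, cough, loose_stools, rash.
# These toy records are invented for illustration; they are not MDS data.
TRAIN = [
    ((1, 1, 0, 0), "pneumonia"),
    ((1, 1, 0, 0), "pneumonia"),
    ((0, 1, 0, 0), "pneumonia"),
    ((0, 0, 1, 0), "diarrhoeal diseases"),
    ((1, 0, 1, 0), "diarrhoeal diseases"),
    ((0, 0, 1, 0), "diarrhoeal diseases"),
    ((1, 0, 0, 1), "measles"),
    ((1, 1, 0, 1), "measles"),
    ((0, 0, 0, 1), "measles"),
]


def hamming(a, b):
    """Count the symptom indicators on which two records disagree."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(record, train=TRAIN, k=3):
    """Assign the majority cause of death among the k closest training records."""
    # sorted() is stable, so ties in distance keep the training order.
    neighbours = sorted(train, key=lambda item: hamming(record, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # A record with loose stools only.
    print(knn_predict((0, 0, 1, 0)))
```

In a comparison like the one described, each algorithm would be trained on the same symptom matrix and scored on held-out deaths; this sketch shows only the prediction step for a single algorithm.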

RESULTS

The support vector machine (SVM) algorithm was the best performer, with a prediction accuracy of over 0.8. Accuracy was highest for diarrhoeal diseases (0.97) and lowest for meningitis/encephalitis (0.80). The top signs/symptoms for classifying these CoDs were also extracted for each disease. A combination of signs/symptoms presented by the deceased individual can effectively lead to the CoD diagnosis.
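The abstract reports accuracy separately for each cause but does not say how it was computed. One plausible reading, shown here purely as an assumption, is one-vs-rest accuracy: for a given cause, a prediction counts as correct when it agrees with the reference label on whether the death was due to that cause. The function name `per_cause_accuracy` and the example labels are hypothetical.

```python
def per_cause_accuracy(true_labels, predicted_labels):
    """Return one-vs-rest accuracy for every cause seen in either label list.

    For each cause, a record is scored as correct when the prediction and the
    reference agree on "this cause" versus "any other cause".
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")

    causes = sorted(set(true_labels) | set(predicted_labels))
    total = len(true_labels)
    scores = {}
    for cause in causes:
        agree = sum(
            (t == cause) == (p == cause)
            for t, p in zip(true_labels, predicted_labels)
        )
        scores[cause] = agree / total
    return scores


if __name__ == "__main__":
    # Invented labels for five deaths; not results from the study.
    reference = ["pneumonia", "pneumonia", "diarrhoea", "measles", "malaria"]
    predicted = ["pneumonia", "malaria", "diarrhoea", "measles", "malaria"]
    for cause, score in per_cause_accuracy(reference, predicted).items():
        print(f"{cause}: {score:.2f}")
```

Under this definition a rare cause can score high simply because most deaths are correctly labelled as "not this cause", so per-cause sensitivity and specificity are usually read alongside it.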

CONCLUSIONS

Overall, this study affirms that verbal autopsy tools are efficient in CoD diagnosis and that automated classification parameters captured through ML could be added to verbal autopsies to improve classification of causes of death.
