Wu Hang, Wang May D
Dept. of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332.
ACM BCB. 2017 Aug;2017:526-535. doi: 10.1145/3107411.3107447.
In biomedical data analysis, inferring the cause of death is a challenging and important task. It is useful both for public health reporting and for improving patients' quality of care by identifying more severe conditions. Causal inference, however, is notoriously difficult. Traditional causal inference relies mainly on analyzing data collected from specifically designed experiments, which are expensive and limited to a particular disease cohort, making the approach less generalizable. In this paper, we adopt a novel data-driven perspective to analyze and improve the death reporting process and to help physicians identify the single underlying cause of death. To achieve this, we build state-of-the-art deep learning models based on convolutional neural networks (CNNs) and achieve around 75% accuracy in predicting the single underlying cause of death from a list of relevant medical conditions. We also provide interpretations for the black-box neural network models, so that death-reporting physicians can apply them with a better understanding of how they behave.