University of Trento, Trento, Italy.
Fondazione Bruno Kessler Research Institute, Trento, Italy.
PLoS One. 2020 Jul 2;15(7):e0235424. doi: 10.1371/journal.pone.0235424. eCollection 2020.
Progress of machine learning in critical care has been difficult to track, in part due to the absence of public benchmarks. Other fields of research, such as computer vision and natural language processing, have established various competitions and public benchmarks. The recent availability of large clinical datasets has made it possible to establish such benchmarks in critical care as well. Taking advantage of this opportunity, we propose a public benchmark suite that addresses four areas of critical care: mortality prediction, estimation of length of stay, patient phenotyping, and risk of decompensation. We define each task and compare the performance of clinical models, baseline models, and deep learning models using the eICU critical care dataset of around 73,000 patients. This is the first public benchmark on a multi-centre critical care dataset, comparing the performance of the clinical gold standard with that of our predictive model. We also investigate the impact of numerical variables, as well as the handling of categorical variables, on each of the defined tasks. The source code detailing our methods and experiments is publicly available, so that anyone can replicate our results and build upon our work.