French Armed Forces Center for Epidemiology and Public Health (CESPA), SSA, Camp de Sainte Marthe, 13568, Marseille, France.
UMR VITROME, IRD, AP-HM, SSA, IHU-Méditerranée Infection, Aix Marseille Univ, 13005, Marseille, France.
BMC Med Inform Decis Mak. 2019 Mar 5;19(1):38. doi: 10.1186/s12911-019-0774-3.
When outbreak detection algorithms (ODAs) are considered individually, the task of outbreak detection can be seen as a classification problem and the ODA as a sensor providing a binary decision (outbreak yes or no) for each day of surveillance. When they are considered jointly (in cases where several ODAs analyze the same surveillance signal), the outbreak detection problem should be treated as a decision fusion (DF) problem of multiple sensors.
This study evaluated the benefit, for a decision support system, of using DF methods (fusing the decisions of multiple ODAs) rather than a single outbreak detection method. For each day, we merged the decisions of six ODAs using five DF methods (two voting methods, logistic regression, CART, and a Bayesian network, BN). Classical metrics of accuracy, prediction, and timeliness were used in the evaluation steps.
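As an illustration of the simplest family of DF methods mentioned above, the sketch below fuses six binary ODA decisions per day by voting. The abstract does not say which two voting rules were used, so the k-of-n threshold rule and the strict-majority rule here are assumptions, as are the function names and the example decisions.

```python
from typing import Sequence


def vote_fusion(decisions: Sequence[int], threshold: int) -> int:
    """Return 1 (outbreak) if at least `threshold` ODAs raised an alarm, else 0.

    Hypothetical helper: a k-of-n voting rule, assumed for illustration.
    """
    if not all(d in (0, 1) for d in decisions):
        raise ValueError("each ODA decision must be 0 or 1")
    return int(sum(decisions) >= threshold)


def majority_vote(decisions: Sequence[int]) -> int:
    """Strict majority: more than half of the ODAs must raise an alarm."""
    return vote_fusion(decisions, len(decisions) // 2 + 1)


# Made-up decisions of six ODAs on three surveillance days (1 = alarm).
daily_decisions = [
    [0, 0, 1, 0, 0, 0],  # one alarm out of six
    [1, 1, 1, 0, 1, 0],  # four alarms out of six
    [1, 0, 1, 0, 1, 0],  # three alarms: a tie, not a strict majority
]

fused = [majority_vote(day) for day in daily_decisions]
print(fused)
```

Lowering `threshold` makes the fused detector more sensitive at the cost of more false alarms; raising it does the opposite.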
The greatest gain in positive predictive value relative to the best single ODA (77%) was observed with the DF methods that include a learning step (BN, logistic regression, and CART).
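For readers who want to see how such a comparison could be computed, the sketch below derives the positive predictive value (PPV) from daily binary decisions and expresses the change as a relative gain. Whether the reported 77% is a relative or an absolute difference is not stated in the abstract; the relative form is an assumption, and all data and names below are made up for illustration.

```python
from typing import Optional, Sequence


def ppv(truth: Sequence[int], alarms: Sequence[int]) -> Optional[float]:
    """Positive predictive value: true alarms / all alarms (None if no alarm)."""
    tp = sum(1 for t, a in zip(truth, alarms) if t == 1 and a == 1)
    fp = sum(1 for t, a in zip(truth, alarms) if t == 0 and a == 1)
    return tp / (tp + fp) if (tp + fp) > 0 else None


def relative_gain(baseline: float, candidate: float) -> float:
    """Relative improvement of `candidate` over `baseline` (assumed definition)."""
    return (candidate - baseline) / baseline


# Made-up ground truth and decisions over ten surveillance days.
truth       = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
single_oda  = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # 2 true alarms, 3 false alarms
fused_alarm = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # 2 true alarms, 2 false alarms

ppv_single = ppv(truth, single_oda)
ppv_fused = ppv(truth, fused_alarm)
gain = relative_gain(ppv_single, ppv_fused)
print(f"PPV single={ppv_single:.2f}, fused={ppv_fused:.2f}, gain={gain:.0%}")
```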
To identify disease outbreaks in systems using several ODAs to analyze surveillance data, we recommend using a DF method based on a Bayesian network. This method is at least equivalent to the best of the algorithms considered, regardless of the situation faced by the system. For those less familiar with this kind of technique, we propose that logistic regression be used when a training dataset is available.
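A minimal sketch of logistic-regression fusion, assuming a labeled training set of daily ODA decisions is available. It uses plain gradient descent in pure Python rather than any particular library, and the training data, learning rate, epoch count, and function names are all hypothetical; it is not the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(
    X: Sequence[Sequence[int]], y: Sequence[int],
    lr: float = 0.5, epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fuse(w: Sequence[float], b: float, decisions: Sequence[int]) -> int:
    """Fused decision: 1 (outbreak) if the predicted probability is >= 0.5."""
    p = sigmoid(sum(wj * xj for wj, xj in zip(w, decisions)) + b)
    return int(p >= 0.5)


# Made-up training set: six ODA decisions per day, with the true outbreak label.
X_train = [
    [0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0], [0, 1, 0, 0, 0, 1],
    [1, 1, 1, 0, 1, 0], [1, 1, 1, 1, 1, 1], [0, 1, 1, 1, 1, 0], [1, 1, 0, 1, 1, 1],
]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = fit_logistic(X_train, y_train)
print(fuse(weights, bias, [1, 1, 1, 1, 1, 1]), fuse(weights, bias, [0, 0, 0, 0, 0, 0]))
```

Unlike a fixed voting rule, the learned weights let the fused decision rely more heavily on the ODAs that were most informative in the training data.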