Department of Epidemiology and Health Statistics, West China School of Public Health and West China Fourth Hospital, Sichuan University, Sichuan, China.
Department of Systems, Populations and Leadership, University of Michigan, School of Nursing, Ann Arbor, USA.
Sci Rep. 2019 Jul 17;9(1):10376. doi: 10.1038/s41598-019-46737-0.
The surveillance of infectious diseases relies on identifying the dynamic relations between infectious diseases and their influencing factors. However, this identification task faces two practical challenges: small sample sizes and delayed effects. To address both challenges and improve identification results, this study evaluated the performance of the dynamic Bayesian network (DBN) in infectious disease surveillance. The evaluation comprised two simulations. The first compared DBN with the Granger causality test and the least absolute shrinkage and selection operator (LASSO) method; the second assessed how DBN could improve the forecasting of infectious diseases. To keep both simulations as close to real-world conditions as possible, their scenarios were adapted from real-world studies, and practical issues such as nonlinearity and nuisance variables were also considered. The main simulation results were: ① When the sample size was large (n = 340), the true positive rates (TPRs) of DBN (≥98%) were slightly higher than those of the Granger causality test and approximately the same as those of the LASSO method; the false positive rates (FPRs) of DBN were on average 46% lower than those of the Granger causality test and 22% lower than those of the LASSO method. ② When the sample size was small, the main problem was a low TPR, which was further aggravated by nonlinearity and nuisance variables. In the worst case (small sample size, nonlinearity, and nuisance variables present), the TPR of DBN declined to 43.30%. However, a similar decline also appeared in the corresponding results of the Granger causality test and the LASSO method.
③ Sample size was important for identifying the dynamic relations among multiple variables; in this case, at least three years of weekly historical data were needed to guarantee the quality of infectious disease surveillance. ④ DBN could improve forecasting results, reducing forecasting errors by 7%. Based on these results, DBN is recommended for improving the quality of infectious disease surveillance.
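The TPR and FPR figures reported above compare the relations a method identifies against the relations built into the simulation. As a minimal sketch of how such rates could be computed, the snippet below scores a recovered set of directed edges against a known one. The function name `edge_rates`, the adjacency-matrix encoding, and the three-variable example are illustrative assumptions, not the study's code; in particular, every matrix entry (including the diagonal) is counted here as a candidate edge, which may differ from how the study counted them.

```python
import numpy as np


def edge_rates(true_adj, est_adj):
    """Return (TPR, FPR) for a recovered set of directed edges.

    Assumption: entry [i, j] is 1 when variable i is taken to influence
    variable j (at some lag), and all entries count as candidate edges.
    """
    true_adj = np.asarray(true_adj, dtype=bool)
    est_adj = np.asarray(est_adj, dtype=bool)
    if true_adj.shape != est_adj.shape:
        raise ValueError("adjacency matrices must have the same shape")

    tp = np.sum(true_adj & est_adj)    # true edges that were recovered
    fp = np.sum(~true_adj & est_adj)   # claimed edges that do not exist
    pos = np.sum(true_adj)             # number of true edges
    neg = np.sum(~true_adj)            # number of true non-edges

    tpr = tp / pos if pos else float("nan")
    fpr = fp / neg if neg else float("nan")
    return float(tpr), float(fpr)


# Hypothetical example: the truth is 0 -> 1 and 1 -> 2;
# the method reports 0 -> 1 (correct) and 0 -> 2 (spurious).
true_adj = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [0, 0, 0]])
est_adj = np.array([[0, 1, 1],
                    [0, 0, 0],
                    [0, 0, 0]])

tpr, fpr = edge_rates(true_adj, est_adj)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

In a simulation study, these rates would typically be averaged over many replicated datasets per scenario, so that methods can be compared on the same footing.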