Bosnjak Michael, Dahm Stefan, Kuhnert Ronny, Weihrauch Dennis, Rosario Angelika Schaffrath, Hurraß Julia, Schmich Patrick, Wieler Lothar H
Trier University, Department for Psychological Research Methods, Trier, Germany.
Robert Koch Institute, Department of Epidemiology and Health Monitoring, Berlin, Germany.
J Health Monit. 2024 Jun 19;9(2):e12100. doi: 10.25646/12100. eCollection 2024 Jun.
Some COVID-19 testing centres have reported manipulated test numbers for antigen tests/rapid tests. This study compares statistical approaches with traditional fraud detection methods. The extent of agreement between traditional and statistical methods was analysed, as well as the extent to which statistical approaches can identify additional cases of potential fraud.
The statistical approaches comprised outlier detection flagging unusually high numbers of tests, modelling of the positivity rate (Poisson regression), and deviation from distributional assumptions for the first digit (Benford's law) and for the last digit of the number of reported tests. The analyses were based on billing data (April 2021 to August 2022) from 907 testing centres in a German city.
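The two digit-based checks can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of a chi-square goodness-of-fit statistic, and the 5% significance level are assumptions made here for illustration. Under Benford's law the first digit d occurs with probability log10(1 + 1/d); the last digit is assumed to be uniformly distributed over 0 to 9.

```python
import math
from collections import Counter

# Standard chi-square critical values at alpha = 0.05 (assumed threshold).
CRIT_FIRST_DIGIT = 15.507  # 9 categories, 8 degrees of freedom
CRIT_LAST_DIGIT = 16.919   # 10 categories, 9 degrees of freedom


def benford_expected():
    """Benford probabilities for the first digits 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def chi_square(observed, expected_probs, n):
    """Pearson chi-square statistic of observed counts against expected shares."""
    return sum(
        (observed.get(k, 0) - n * p) ** 2 / (n * p)
        for k, p in expected_probs.items()
    )


def first_digit_stat(counts):
    """Deviation of the first digits of reported test counts from Benford's law."""
    digits = [int(str(c)[0]) for c in counts if c > 0]
    return chi_square(Counter(digits), benford_expected(), len(digits))


def last_digit_stat(counts):
    """Deviation of the last digits of reported test counts from uniformity."""
    digits = [c % 10 for c in counts if c > 0]
    uniform = {d: 0.1 for d in range(10)}
    return chi_square(Counter(digits), uniform, len(digits))


# Hypothetical daily test counts of one centre, all rounded to multiples of 50.
reported = [50, 100, 150, 200] * 10
suspicious_last_digit = last_digit_stat(reported) > CRIT_LAST_DIGIT
```

In this hypothetical example every reported count ends in 0, so the last-digit statistic far exceeds the critical value and the centre would be flagged for closer inspection. A flag of this kind indicates an initial suspicion only, not proof of fraud.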
The positive agreement between the conventional and statistical approaches ('sensitivity') was between 8.6% and 24.7%, and the negative agreement ('specificity') was between 91.3% and 94.6%. The proportion of potentially fraudulent testing centres additionally identified by statistical approaches was between 7.0% and 8.7%. Combining at least two statistical methods yielded the best detection rate for testing centres whose initial suspicion had previously gone undetected.
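How such agreement measures can be derived from two sets of flags is sketched below in Python. The sketch treats the conventional method as the reference; the function name, the input format, and in particular the denominator used for the additionally identified share (all centres) are assumptions, since the abstract does not specify them.

```python
def agreement(conventional, statistical):
    """Compare two lists of boolean flags, one entry per testing centre.

    The conventional flags serve as the reference (an assumption).
    """
    pairs = list(zip(conventional, statistical))
    both = sum(1 for c, s in pairs if c and s)
    conv_only = sum(1 for c, s in pairs if c and not s)
    neither = sum(1 for c, s in pairs if not c and not s)
    stat_only = sum(1 for c, s in pairs if not c and s)

    def ratio(num, den):
        # Return None instead of dividing by zero for empty groups.
        return num / den if den else None

    return {
        # Share of conventionally flagged centres also flagged statistically.
        "positive_agreement": ratio(both, both + conv_only),
        # Share of conventionally unflagged centres also unflagged statistically.
        "negative_agreement": ratio(neither, neither + stat_only),
        # Share of all centres flagged only by the statistical method
        # (denominator is an assumption, not taken from the study).
        "additional_share": ratio(stat_only, len(pairs)),
    }


# Hypothetical flags for five centres.
conventional_flags = [True, True, False, False, False]
statistical_flags = [True, False, False, True, False]
result = agreement(conventional_flags, statistical_flags)
```

For the five hypothetical centres above, one of the two conventionally flagged centres is also flagged statistically, two of the three unflagged centres remain unflagged, and one centre is flagged by the statistical method alone.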
The statistical approaches were more effective and systematic in identifying potentially fraudulent testing centres than the conventional methods. In future pandemics, testing centres should be urged to record paradata (e.g. timestamps of testing).