Park Junghyun A, Kim Minki, Yoon Seokjoon
Minki Kim, SUPEX Hall 304, 85 Hoegiro, Dongdaemun-Gu, Seoul 130-722, Republic of Korea, Email:
Methods Inf Med. 2016 May 17;55(3):284-91. doi: 10.3414/ME15-01-0076. Epub 2016 Apr 20.
Sophisticated anti-fraud systems for the healthcare sector have been built on several statistical methods. However, these existing fraud-detection algorithms are costly and time-consuming, and they lack a theoretical basis for handling large-scale data.
Drawing on mathematical theory, this study proposes a new approach to applying Benford's Law: we closely examine individual-level data to identify specific fees that warrant in-depth analysis.
We extended the mathematical theory to demonstrate how large-scale data conform to Benford's Law. We then empirically tested its applicability using actual large-scale healthcare data from Korea's Health Insurance Review and Assessment (HIRA) National Patient Sample (NPS). To test the conformity of the large-scale data to Benford's Law, we used the mean absolute deviation (MAD) formula.
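To make the MAD test concrete, the following is a minimal sketch of a first-digit Benford check. It is an illustration under stated assumptions, not the authors' implementation: the function names (`first_digit`, `benford_expected`, `benford_mad`) are hypothetical, only the first significant digit is tested, and MAD is taken as the average over the nine digits of the absolute difference between the observed proportion and the Benford proportion log10(1 + 1/d).

```python
import math
from collections import Counter


def first_digit(x):
    """Return the leading nonzero digit of a nonzero number (hypothetical helper)."""
    # Scientific notation puts the leading significant digit first, e.g. '1.23e+03'.
    return int(f"{abs(x):.15e}"[0])


def benford_expected():
    """Benford's Law first-digit probabilities: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def benford_mad(values):
    """Mean absolute deviation between observed and Benford first-digit proportions.

    Assumption: zeros are skipped because they have no leading nonzero digit.
    """
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("no nonzero values to test")
    counts = Counter(digits)
    expected = benford_expected()
    return sum(abs(counts.get(d, 0) / n - expected[d]) for d in range(1, 10)) / 9


if __name__ == "__main__":
    # Powers of 2 are a classic Benford-conforming sequence (illustrative data only).
    conforming = [2.0 ** k for k in range(1000)]
    # Equal counts of every leading digit deviate strongly from Benford's Law.
    uniform = [float(d) for d in range(1, 10)] * 100
    print("MAD, powers of 2:   ", benford_mad(conforming))
    print("MAD, uniform digits:", benford_mad(uniform))
```

A smaller MAD indicates closer conformity. In a setting like the one described here, the input would be the individual fee amounts for one disease or cost category, and categories with larger MAD values would be candidates for closer inspection.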
We conducted our study on 32 diseases, comprising 25 representative diseases and 7 diseases regulated under diagnosis-related groups (DRG). We performed an empirical test on the 25 representative diseases, which showed that Benford's Law is applicable to large-scale data in the healthcare industry. For the seven DRG-regulated diseases, we examined the individual-level data to identify specific fees for in-depth analysis. Across the eight categories of medical costs, we assessed the strength of irregularities based on the details of each DRG-regulated disease.
Using the degree of abnormality, we propose priority actions that government health departments and private insurance institutions can take to bring unnecessary medical expenses under control. However, detecting deviations from Benford's Law at conventional significance levels requires relatively high contamination ratios.