Japan Medical Data Center Co, Ltd, Tokyo, Japan.
J Epidemiol. 2010;20(5):413-9. doi: 10.2188/jea.je20090066. Epub 2010 Aug 7.
Health insurance claims (ie, receipts) record patient health care treatments and expenses and, although created for the health care payment system, are potentially useful for research. Combining different types of receipts generated for the same patient would dramatically increase the utility of these receipts. However, technical problems, including standardization of disease names and classifications, and anonymous linkage of individual receipts, must be addressed.
In collaboration with health insurance societies, all information from receipts (inpatient, outpatient, and pharmacy) was collected. To standardize disease names and classifications, we developed a computer-aided post-entry standardization method using a disease name dictionary based on International Classification of Diseases (ICD)-10 classifications. We also developed an anonymous linkage system by using an encryption code generated from a combination of hash values and stream ciphers. Using different sets of the original data (data set 1: insurance certificate number, name, and sex; data set 2: insurance certificate number, date of birth, and relationship status), we compared the percentage of successful record matches obtained by using data set 1 to generate key codes with the percentage obtained when both data sets were used.
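The two methods described above can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary entries, the function names, and the secret key are hypothetical, and HMAC-SHA256 stands in for the paper's combination of hash values and stream ciphers, whose details the abstract does not give.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical dictionary: normalized disease name -> ICD-10 code.
# A real dictionary would hold many more names and spelling variants.
DISEASE_DICTIONARY = {
    "essential hypertension": "I10",
    "hypertension": "I10",
    "type 2 diabetes mellitus": "E11",
    "acute bronchitis": "J20",
}


def normalize(text):
    """NFKC-normalize, lowercase, and collapse runs of whitespace."""
    folded = unicodedata.normalize("NFKC", text).lower()
    return " ".join(folded.split())


def standardize_disease_name(raw_name):
    """Return the ICD-10 code for a free-text name, or None if unmatched."""
    return DISEASE_DICTIONARY.get(normalize(raw_name))


def linkage_key(fields, secret):
    """Derive an anonymous key from identifying fields.

    Assumption: a keyed hash (HMAC-SHA256) replaces the paper's
    hash-plus-stream-cipher scheme, which is not specified in the abstract.
    """
    message = "\x1f".join(normalize(f) for f in fields).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()
```

Names that return `None` would be left for manual review. Because the key is derived only from normalized fields and a secret, receipts from the same person receive the same key without the identifying fields being stored alongside the claims data.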
The dictionary's automatic conversion of disease names successfully standardized 98.1% of approximately 2 million new receipts entered into the database. The percentage of anonymous matches was higher for the combined data sets (98.0%) than for data set 1 (88.5%).
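One way to see why a second data set can raise the match rate is sketched below. The abstract does not say how the two data sets were combined, so the sketch assumes that two receipts link when either key agrees. The record layout, field names, secret, and sample values are all hypothetical, and HMAC-SHA256 again stands in for the paper's encryption scheme.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical; a real system would protect this key


def key_for(fields):
    """Keyed hash over the given identifying fields."""
    message = "\x1f".join(fields).encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def keys_for(record):
    """Return (key1, key2) built from the two field sets named in the text."""
    key1 = key_for([record["cert_no"], record["name"], record["sex"]])
    key2 = key_for(
        [record["cert_no"], record["birth_date"], record["relationship"]]
    )
    return key1, key2


def is_match(a, b, use_both=True):
    """Assumed rule: link on key1, and fall back to key2 when allowed."""
    a1, a2 = keys_for(a)
    b1, b2 = keys_for(b)
    if a1 == b1:
        return True
    return use_both and a2 == b2


def match_rate(pairs, use_both):
    """Fraction of record pairs that link under the chosen rule."""
    matched = sum(is_match(a, b, use_both) for a, b in pairs)
    return matched / len(pairs)


# Made-up receipts for one person; the name is spelled differently
# on the pharmacy receipt, so key1 disagrees while key2 still agrees.
outpatient = {"cert_no": "12345", "name": "YAMADA TARO", "sex": "M",
              "birth_date": "1970-01-01", "relationship": "insured"}
pharmacy = dict(outpatient, name="YAMADA TAROU")
```

In this made-up case the spelling difference breaks the link when only the first key is used, while the second key, which does not depend on the name, restores it. For simplicity the records here carry raw fields; in an anonymized workflow only the derived keys would leave the insurer.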
The use of standardized disease classifications and anonymous record linkage substantially contributed to the construction of a large, chronologically organized database of receipts. This database is expected to aid in epidemiologic and health services research using receipt information.