Hennessy Sean, Bilker Warren B, Weber Anita, Strom Brian L
Center for Clinical Epidemiology, Department of Biostatistics, University of Pennsylvania School of Medicine, Philadelphia, PA, USA.
Pharmacoepidemiol Drug Saf. 2003 Mar;12(2):103-11. doi: 10.1002/pds.765.
To examine the integrity of six Medicaid databases for use in pharmacoepidemiology research.
We performed descriptive analyses to examine four categories of potential data errors: incomplete claims for certain time periods; absence of an accurate indicator of inpatient hospitalizations; missing hospitalizations for those aged 65 years and over; and diagnostic codes recorded in demographic groups in which the coded conditions should be rare.
Prescription claims appeared to be missing intermittently in some states. No valid marker of inpatient hospitalizations could be found for three of six states. Hospitalizations appeared to be missing to varying degrees for those aged 65 years and over. Gross errors in diagnostic codes and demographic data did not appear to be widespread.
Whenever possible, investigators using administrative data should perform macro-level descriptive analyses on the parent data set. In particular, researchers should examine the number of medical and pharmacy claims over time, looking for gaps. The validity of markers of hospitalization should be assessed. The accuracy of diagnosis and demographic data should be examined. Such a descriptive macro-level approach should be used to supplement, and perhaps precede, validation of study outcomes using clinical records.
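The recommendation to examine claim counts over time for gaps can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the input of one service date per claim, and the rule of flagging any month below half the median monthly count are all assumptions chosen for illustration.

```python
from collections import Counter
from datetime import date
from statistics import median


def month_range(start, end):
    """Yield every (year, month) pair from start to end, inclusive."""
    y, m = start
    while (y, m) <= end:
        yield (y, m)
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)


def flag_gaps(claim_dates, threshold=0.5):
    """Return months whose claim count is below threshold * median.

    claim_dates: iterable of datetime.date, one per claim (hypothetical input).
    threshold:   assumed cutoff, not a value taken from the study.
    """
    counts = Counter((d.year, d.month) for d in claim_dates)
    if not counts:
        return []
    # Include months with zero claims between the first and last observed month.
    months = list(month_range(min(counts), max(counts)))
    typical = median(counts.get(ym, 0) for ym in months)
    return [ym for ym in months if counts.get(ym, 0) < threshold * typical]


# Toy data: 10 claims in each of Jan, Feb, May, Jun 2000; none in Mar; 2 in Apr.
dates = [date(2000, m, 1) for m in (1, 2, 5, 6) for _ in range(10)]
dates += [date(2000, 4, 1)] * 2
print(flag_gaps(dates))  # [(2000, 3), (2000, 4)]
```

In practice the counts would be computed separately by state and by claim type (medical versus pharmacy), since the gaps described above appeared intermittently and only in some states.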