Department of Statistics, Columbia University, New York, USA.
Stat Methods Med Res. 2013 Feb;22(1):39-56. doi: 10.1177/0962280211403602. Epub 2011 Aug 30.
Data mining disproportionality methods (PRR, ROR, EBGM, IC, etc.) are commonly used to identify drug safety signals in spontaneous report system (SRS) databases. Newer data sources such as longitudinal observational databases (LODs) provide time-stamped, patient-level information and overcome some of the limitations of SRS data, such as the absence of a denominator (the total number of patients who take a drug) and limited temporal information. The application of disproportionality methods to LODs has not been widely explored. The scale of LOD data also poses a computational challenge: the larger health claims databases contain information on more than 50 million patients, with up to 10 years of records per patient. In this article we systematically explore the application of commonly used disproportionality methods to simulated and real LOD data.
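To make the disproportionality measures concrete, the following minimal sketch computes two of them, the proportional reporting ratio (PRR) and the reporting odds ratio (ROR), from a 2x2 table of counts for one drug-event pair. It uses the standard textbook definitions, which the abstract itself does not spell out; the function name, the cell labels `a`, `b`, `c`, `d`, and the example counts are illustrative assumptions, not taken from the article.

```python
import math

def disproportionality(a, b, c, d):
    """PRR and ROR for one drug-event pair (hypothetical helper).

    a: records with the drug and the event
    b: records with the drug, other events
    c: records with other drugs and the event
    d: records with other drugs, other events
    All four counts are assumed to be positive.
    """
    # PRR: event proportion among drug records vs. among all other records
    prr = (a / (a + b)) / (c / (c + d))
    # ROR: odds of the event with the drug vs. odds without it
    ror = (a * d) / (b * c)
    # Approximate 95% interval for the ROR, computed on the log scale
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(ror) - 1.96 * se)
    high = math.exp(math.log(ror) + 1.96 * se)
    return {"prr": prr, "ror": ror, "ror_ci95": (low, high)}

# Made-up counts, for illustration only
result = disproportionality(a=20, b=80, c=100, d=9800)
print(result)
```

With these made-up counts the PRR is 19.8 and the ROR is 24.5. How the four cells should be counted in a LOD (for example, per patient, per exposure period, or per event) is a design choice this sketch does not address.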