Szarfman Ana, Machado Stella G, O'Neill Robert T
Office of Biostatistics, Center for Drug Evaluation and Research, Food and Drug Administration, Rockville, Maryland 20857, USA.
Drug Saf. 2002;25(6):381-92. doi: 10.2165/00002018-200225060-00001.
Since 1998, the US Food and Drug Administration (FDA) has been exploring new automated and rapid Bayesian data mining techniques. These techniques have been used to systematically screen the FDA's huge MedWatch database of voluntary reports of adverse drug events for possible events of concern. The data mining method currently being used is the Multi-Item Gamma Poisson Shrinker (MGPS) program, which replaced the Gamma Poisson Shrinker (GPS) program we originally used with the legacy database. The MGPS algorithm, the technical aspects of which are summarised in this paper, computes signal scores for pairs, and for higher-order (e.g. triplet, quadruplet) combinations, of drugs and events that are significantly more frequent than their pair-wise associations would predict. MGPS generates consistent, redundant, and replicable signals while minimising random patterns. Signals are generated without using external exposure data, adverse event background information, or medical information on adverse drug reactions. The MGPS interface streamlines multiple input-output processes that previously had been manually integrated. The system, however, cannot distinguish between already-known associations and new associations, so the reviewers must filter these events. In addition to detecting possible serious single-drug adverse event problems, MGPS is currently being evaluated to detect possible synergistic interactions between drugs (drug interactions) and adverse events (syndromes), and to detect differences among subgroups defined by gender and by age, such as paediatrics and geriatrics. In the current data, only 3.4% of all 1.2 million drug-event pairs ever reported (with frequencies ≥ 1) generate signals [a lower 95% confidence interval limit of the adjusted ratio of observed to expected (O/E) counts (denoted EB05) of ≥ 2].
The frequency counts that contributed to signals comprised 23% (2.4 million) of the total of 10.4 million drug-event pair reports, which greatly facilitates more focused follow-up and evaluation. The algorithm provides an objective, systematic view of the data, alerting reviewers to critically important, new safety signals. The study of signals detected by current methods, signals stored in the Center for Drug Evaluation and Research's Monitoring Adverse Reports Tracking System, and the signals regarding cerivastatin, a cholesterol-lowering drug voluntarily withdrawn from the market in August 2001, exemplifies the potential of data mining to improve early signal detection. The operating characteristics of data mining in detecting early safety signals, exemplified by studying a drug recently well characterised by large clinical trials, confirm our experience that the signals generated by data mining have high enough specificity to deserve further investigation. The application of these tools may ultimately improve usage recommendations.
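The observed-over-expected (O/E) ratio and the idea of an "adjusted" (shrunken) ratio can be illustrated with a minimal sketch. This is not the MGPS program: the abstract does not specify the prior or how EB05 is computed, so the example assumes a single Gamma(alpha, beta) prior with hypothetical hyperparameters, takes the expected count from drug and event marginals under independence, and reports only the posterior mean rather than a lower 95% limit. The function name `signal_scores` and the toy reports are invented for illustration.

```python
from collections import Counter


def signal_scores(reports, alpha=1.0, beta=1.0):
    """Return {(drug, event): (observed, expected, raw_ratio, shrunk_ratio)}.

    reports     : iterable of (drug, event) tuples, one per reported pair.
    alpha, beta : hypothetical Gamma prior hyperparameters (an assumption,
                  not values taken from MGPS).
    """
    reports = list(reports)
    total = len(reports)
    pair_counts = Counter(reports)
    drug_counts = Counter(drug for drug, _ in reports)
    event_counts = Counter(event for _, event in reports)

    scores = {}
    for (drug, event), observed in pair_counts.items():
        # Expected count if drug and event were reported independently.
        expected = drug_counts[drug] * event_counts[event] / total
        raw_ratio = observed / expected
        # Gamma-Poisson conjugacy: the posterior is
        # Gamma(alpha + observed, beta + expected); this is its mean.
        shrunk_ratio = (alpha + observed) / (beta + expected)
        scores[(drug, event)] = (observed, expected, raw_ratio, shrunk_ratio)
    return scores


# Toy data (invented): two well-reported drugs and one pair reported once.
toy_reports = (
    [("drugA", "rash")] * 8
    + [("drugA", "nausea")] * 2
    + [("drugB", "rash")] * 2
    + [("drugB", "nausea")] * 8
    + [("drugC", "seizure")]
)

scores = signal_scores(toy_reports)
for pair, (obs, exp, raw, shrunk) in sorted(scores.items()):
    print(f"{pair}: O={obs} E={exp:.2f} O/E={raw:.2f} shrunk={shrunk:.2f}")
```

In this toy data the pair reported only once has a very large raw O/E ratio because its expected count is tiny, while the shrunken ratio is pulled strongly toward the prior. That is the sense in which shrinkage minimises random patterns; a threshold such as EB05 ≥ 2 would be applied to a lower posterior limit, which this sketch does not compute.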