Office of Biostatistics and Epidemiology, Center for Biologics Evaluation and Research (CBER), US FDA, Woodmont Office Complex 1, Rm 306N, 1401 Rockville Pike, Rockville, MD 20852, USA.
Drug Saf. 2013 Jul;36(7):573-82. doi: 10.1007/s40264-013-0064-4.
Automating the classification of adverse event reports is an important step to improve the efficiency of vaccine safety surveillance. Previously we showed it was possible to classify reports using features extracted from the text of the reports.
The aim of this study was to use the information encoded in the Medical Dictionary for Regulatory Activities (MedDRA(®)) in the US Vaccine Adverse Event Reporting System (VAERS) to support and evaluate two classification approaches: a multiple information retrieval strategy and a rule-based approach. To evaluate the performance of these approaches, we selected the conditions of anaphylaxis and Guillain-Barré syndrome (GBS).
We used the MedDRA(®) Preferred Terms stored in VAERS, together with two standardized medical terminologies, the Brighton Collaboration (BC) case definitions and the Standardized MedDRA(®) Queries (SMQ), to classify two sets of reports, one for GBS and one for anaphylaxis. Two approaches were used: (i) the rule-based instruments provided by the two terminologies (the Automatic Brighton Classification [ABC] tool and the SMQ algorithms); and (ii) the vector space model.
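The vector space model can be illustrated with a minimal sketch. It assumes each report and each case definition is represented as a bag of Preferred Terms and that a report is flagged when the cosine similarity between the two vectors reaches a threshold. The function names, the 0.5 threshold, and the example terms are illustrative assumptions; they are not the study's actual term lists, weights, or cut-offs.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    shared = a.keys() & b.keys()
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)


def classify(report_terms, query_terms, threshold=0.5):
    """Flag a report when it is similar enough to the query vector.

    The threshold is a hypothetical value chosen for illustration.
    """
    report_vec = Counter(t.lower() for t in report_terms)
    query_vec = Counter(t.lower() for t in query_terms)
    return cosine_similarity(report_vec, query_vec) >= threshold


# Illustrative query terms only, not an actual SMQ or BC term list.
query = ["urticaria", "wheezing", "hypotension"]
report_a = ["Urticaria", "Wheezing", "Pyrexia"]
report_b = ["Headache", "Pyrexia"]

flag_a = classify(report_a, query)
flag_b = classify(report_b, query)
```

Here `report_a` shares two of three terms with the query and would be flagged, while `report_b` shares none. Unlike a rule-based instrument, the score degrades gradually as fewer terms match, which is one plausible reason for a difference in sensitivity.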
We found that the rule-based instruments, particularly the SMQ algorithms, achieved a high degree of specificity; however, this came at a cost in sensitivity for all but the narrow GBS SMQ algorithm, which outperformed the remaining approaches (sensitivity in the testing set was 99.06 % for this algorithm vs. 93.40 % for the vector space model). In the case of anaphylaxis, the vector space model achieved higher sensitivity in the testing set than the best values of both the ABC tool and the SMQ algorithms (86.44 % vs. 64.11 % and 52.54 %, respectively).
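For reference, sensitivity and specificity are computed from a classifier's output against reviewer-assigned labels as TP / (TP + FN) and TN / (TN + FP). The sketch below uses made-up labels purely to show the arithmetic; the function name and the data are assumptions and do not reproduce the study's reports or results.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (truthy = case)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels: 1 = confirmed case, 0 = non-case.
truth = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, predicted)
```

With these toy labels, two of three true cases are detected and one of two non-cases is correctly rejected.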
Our results showed the superiority of the vector space model over the existing rule-based approaches irrespective of the standardized medical knowledge represented by either the SMQ or the BC case definition. The vector space model might make automation of case definitions for spontaneous report review more efficient than current rule-based approaches, allowing more time for critical assessment and decision making by pharmacovigilance experts.