Appl Clin Inform. 2013 Feb 27;4(1):88-99. doi: 10.4338/ACI-2012-11-RA-0049. Print 2013.
We previously demonstrated that a general-purpose text mining system, the Vaccine adverse event Text Mining (VaeTM) system, could be used to automatically classify reports of anaphylaxis for post-marketing safety surveillance of vaccines.
To evaluate the ability of VaeTM to classify reports to the Vaccine Adverse Event Reporting System (VAERS) of possible Guillain-Barré Syndrome (GBS).
We used VaeTM to extract the key diagnostic features from the text of reports in VAERS. Then, we applied the Brighton Collaboration (BC) case definition for GBS, and an information retrieval strategy (i.e. the vector space model) to quantify the specific information that is included in the key features extracted by VaeTM and compared it with the encoded information that is already stored in VAERS as Medical Dictionary for Regulatory Activities (MedDRA) Preferred Terms (PTs). We also evaluated the contribution of the primary (diagnosis and cause of death) and secondary (second level diagnosis and symptoms) diagnostic VaeTM-based features to the total VaeTM-based information.
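As an illustration of the vector space model mentioned above, the following minimal sketch scores reports against a query by cosine similarity over term counts. It is not the authors' implementation: the query terms, the report contents, and the function name `cosine_similarity` are hypothetical, and raw term counts stand in for whatever weighting the study actually used.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical query built from case-definition terms (illustrative only).
query = Counter(["weakness", "areflexia", "paresthesia"])

# Hypothetical features extracted from two reports (illustrative only).
report_a = Counter(["weakness", "areflexia", "fever"])
report_b = Counter(["rash", "fever"])

score_a = cosine_similarity(query, report_a)
score_b = cosine_similarity(query, report_b)
print(score_a, score_b)
```

Reports that share more terms with the query receive a higher score, so ranking reports by this score gives a continuous measure that can then be thresholded for classification.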
MedDRA captured more information and better supported the classification of reports for GBS than VaeTM (AUC: 0.904 vs. 0.777); the lower performance of VaeTM is likely due to the lack of extraction by VaeTM of specific laboratory results that are included in the BC criteria for GBS. On the other hand, the VaeTM-based classification exhibited greater specificity than the MedDRA-based approach (94.96% vs. 87.65%). Most of the VaeTM-based information was contained in the secondary diagnostic features.
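For readers less familiar with the two metrics reported above, the sketch below shows one common way to compute specificity and the area under the ROC curve (AUC) from labels and scores. The labels, scores, and the 0.45 threshold are toy values chosen for illustration and are not data from the study; the AUC is computed with the pairwise (Mann-Whitney) formulation, which is an assumption about method rather than a description of what the authors did.

```python
def specificity(y_true, y_pred):
    """True negatives divided by all actual negatives."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data: 1 = confirmed case, 0 = non-case (illustrative only).
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]

# Classify as a case when the score reaches a hypothetical threshold.
threshold = 0.45
y_pred = [1 if s >= threshold else 0 for s in scores]

spec = specificity(y_true, y_pred)
area = auc(y_true, scores)
print(spec, area)
```

Specificity depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why one approach can have the higher AUC while the other has the higher specificity at its operating point.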
For GBS, clinical signs and symptoms alone are not sufficient to match MedDRA coding for purposes of case classification, but are preferred if specificity is the priority.