Sandberg Lovisa, Vidlin Sara Hedfors, K-Pápai Levente, Savage Ruth, Raemaekers Boukje C, Taavola-Gustafsson Henric, Rudolph Annette, Quirant Lucy, Bergvall Tomas, Wallberg Magnus, Ellenius Johan
Uppsala Monitoring Centre, Uppsala, Sweden.
New Zealand Pharmacovigilance Centre, University of Otago, Dunedin, New Zealand.
Drug Saf. 2025 Jun 23. doi: 10.1007/s40264-025-01559-0.
Information on the safety of medicine use during pregnancy is limited at the time of marketing, making post-marketing surveillance essential. However, the lack of a specific indicator for pregnancy-related case reports within the international standard for transmission of individual case safety reports complicates the retrieval of such reports in pharmacovigilance databases. To address this, an algorithm to identify reports of exposures during pregnancy was developed in VigiBase, the World Health Organization global database of adverse event reports.
We aimed to evaluate and characterise the VigiBase pregnancy algorithm.
The rule-based algorithm uses multiple structured data elements in the International Council for Harmonisation (ICH) E2B transmission format that could hold pregnancy-related information to determine whether a case report qualifies as a pregnancy case. Free-text information is not considered. Three datasets were used for the evaluation. The "Full dataset" comprised deduplicated VigiBase data up to January 2023. The "Downsampled dataset" was a subsample of the Full dataset, adjusted to increase the prevalence of pregnancy reports by excluding individuals aged 45 years or older and male individuals aged 18 years or older; it was used to evaluate recall (i.e. sensitivity). The "Random dataset" was a simple random sample of the Full dataset, used to evaluate precision (i.e. positive predictive value). As a baseline for comparison, the Standardised Medical Dictionary for Regulatory Activities (MedDRA) Query (SMQ) "Pregnancy and neonatal topics (narrow)" was used. To provide a gold standard for the evaluation, case reports were manually annotated as either "pregnancy case" or "non-pregnancy case": all reports in the Downsampled dataset, and those reports in the Random dataset that were flagged as pregnancy cases by the algorithm or the SMQ baseline.
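The exclusion rule behind the Downsampled dataset can be illustrated with a minimal Python sketch. The `Report` type, its field names and the choice to keep reports with missing age or sex are assumptions made for illustration only; the abstract does not describe how such reports were handled or how VigiBase represents these fields.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Report:
    """Hypothetical, simplified case report (not the VigiBase schema)."""
    report_id: str
    sex: Optional[str]          # "F", "M", or None when not reported
    age_years: Optional[float]  # None when not reported


def keep_in_downsample(report: Report) -> bool:
    """Apply the two exclusions stated in the abstract.

    Assumption: a report with unknown age is kept, because neither
    exclusion can be established for it.
    """
    if report.age_years is None:
        return True
    if report.age_years >= 45:
        return False  # individuals aged 45 years or older
    if report.sex == "M" and report.age_years >= 18:
        return False  # male individuals aged 18 years or older
    return True


def downsample(reports: List[Report]) -> List[Report]:
    """Return only the reports that survive both exclusions."""
    return [r for r in reports if keep_in_downsample(r)]


example = [
    Report("a", "F", 30.0),   # kept
    Report("b", "F", 47.0),   # excluded: aged 45 or older
    Report("c", "M", 25.0),   # excluded: adult male
    Report("d", "M", 0.1),    # kept: a male infant may be an in utero exposure
    Report("e", None, None),  # kept under the stated assumption
]
kept_ids = [r.report_id for r in downsample(example)]
```

Male individuals under 18 are retained, which is consistent with the exclusion as worded in the abstract; the remaining handling choices are illustrative.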
In the Downsampled dataset of 7874 annotated reports, 253 reports were annotated as pregnancy cases. Of those, the algorithm recalled 75% (95% confidence interval [CI] 69-80), increasing to 91% (95% CI 86-95) when the analysis was restricted to reports adhering to the ICH E2B format. Most false negatives stemmed from a preprocessing obstacle, the incomplete mapping of specific pregnancy terms to MedDRA, followed by pregnancy information that was confined to free text. The SMQ baseline had a lower recall of 62% (95% CI 56-68). In the Random dataset of 30,000 reports, the algorithm flagged 344 reports, of which 316 were annotated as pregnancy cases, giving a precision of 92% (95% CI 88-95). The main reasons for false positives were postpartum indications, non-pregnancy-specific events, and information miscoded as pregnancy related. The SMQ baseline had a lower precision of 74% (95% CI 69-78).
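The precision figure follows directly from the reported counts (316 true pregnancy cases among 344 flagged reports). The sketch below recomputes it and attaches a Wilson score interval. The abstract does not state which confidence interval method was used, so the Wilson interval is an assumption here and its bounds may differ slightly from the published ones.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half_width = (z / denom) * math.sqrt(
        p * (1.0 - p) / total + z * z / (4.0 * total * total)
    )
    return centre - half_width, centre + half_width


# Counts reported for the Random dataset.
flagged = 344          # reports flagged by the algorithm
true_positives = 316   # flagged reports annotated as pregnancy cases

precision = true_positives / flagged
lower, upper = wilson_interval(true_positives, flagged)

print(f"precision = {precision:.1%}")
print(f"approx. 95% CI (Wilson) = {lower:.1%} to {upper:.1%}")
```

Recall can be computed the same way, with the number of annotated pregnancy cases (253 in the Downsampled dataset) as the denominator; the abstract gives the recall percentage but not the underlying count of recalled reports.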
The VigiBase pregnancy algorithm demonstrates robust performance, highlighting its potential to facilitate pharmacovigilance related to pregnancy. Our evaluation establishes a valuable benchmark for future research and emphasises the need for global harmonisation of standards for reporting pregnancy exposures.