Dimitsaki Stella, Natsiavas Pantelis, Jaulent Marie-Christine
Laboratoire d'Informatique Médicale et d'Ingénierie des Connaissances en e-Santé - LIMICS, Inserm, Université Sorbonne Paris-Nord, Sorbonne Université, Paris, France.
Centre for Research and Development Hellas, Institute of Applied Biosciences, Thessaloniki, Greece.
J Med Internet Res. 2024 Dec 30;26:e57824. doi: 10.2196/57824.
Artificial intelligence (AI) applied to real-world data (RWD; eg, electronic health care records) has been identified as a potentially promising technical paradigm for the pharmacovigilance field. There are several instances of AI approaches applied to RWD; however, most studies focus on unstructured RWD (conducting natural language processing on various data sources, eg, clinical notes, social media, and blogs). Hence, it is essential to investigate how AI is currently applied to structured RWD in pharmacovigilance and how new approaches could enrich the existing methodology.
This scoping review depicts the emerging use of AI on structured RWD for pharmacovigilance purposes to identify relevant trends and potential research gaps.
The scoping review follows the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. We queried the MEDLINE database through the PubMed search engine and retrieved relevant scientific manuscripts published from January 2010 to January 2024. The included studies were "mapped" against a set of evaluation criteria: the AI approaches applied, code availability, description of the data preprocessing pipeline, clinical validation of the AI models, and implementation of trustworthy AI criteria following the guidelines of the FUTURE (Fairness, Universality, Traceability, Usability, Robustness, and Explainability)-AI initiative.
The scoping review ultimately yielded 36 studies. There has been a significant increase in relevant studies after 2019. Most of the articles focused on adverse drug reaction detection procedures (23/36, 64%) for specific adverse effects. Furthermore, a substantial number of studies (34/36, 94%) used nonsymbolic AI approaches, emphasizing classification tasks. Random forest was the most popular machine learning approach identified in this review (17/36, 47%). The most common RWD sources used were electronic health care records (28/36, 78%). Typically, these data were not available in a widely adopted data model that would facilitate interoperability, and they came from proprietary databases, limiting their availability for reproducing results. On the basis of the evaluation criteria classification, 11% (4/36) of the studies published their code in public registries, 17% (6/36) tested their AI models in clinical environments, and 36% (13/36) provided information about the data preprocessing pipeline. In addition, in terms of trustworthy AI, 89% (32/36) of the studies followed at least half of the trustworthy AI initiative guidelines. Finally, selection and confounding biases were the most common biases in the included studies.
AI, along with structured RWD, constitutes a promising line of work for drug safety and pharmacovigilance. However, in terms of AI, some approaches have not been examined extensively in this field (such as explainable AI and causal AI). Moreover, it would be helpful to have a data preprocessing protocol for RWD to support pharmacovigilance processes. Finally, because of personal data sensitivity, evaluation procedures have to be investigated further.