Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records.

Affiliations

Clinical Pharmacology Unit, Zealand University Hospital, Roskilde, Denmark.

NNF Center for Protein Research, University of Copenhagen, Copenhagen, Denmark.

Publication information

Basic Clin Pharmacol Toxicol. 2022 Oct;131(4):282-293. doi: 10.1111/bcpt.13773. Epub 2022 Jul 26.

Abstract

We sought to craft a drug safety signalling pipeline associating latent information in clinical free text with exposures to single drugs and drug pairs. Data arose from 12 secondary and tertiary public hospitals in two Danish regions, comprising approximately half the Danish population. Notes were operationalised with a fastText embedding, based on which we trained 10 270 neural-network models (one for each distinct single-drug/drug-pair exposure) predicting the risk of exposure given an embedding vector. We included 2 905 251 admissions between May 2008 and June 2016, with 13 740 564 distinct drug prescriptions; the median number of prescriptions was 5 (IQR: 3-9) and in 1 184 340 (41%) admissions patients used ≥5 drugs concomitantly. A total of 10 788 259 clinical notes were included, with 179 441 739 tokens retained after pruning. Of 345 single-drug signals reviewed, 28 (8.1%) represented possibly undescribed relationships; 186 (54%) signals were clinically meaningful. Sixteen (14%) of the 115 drug-pair signals were possible interactions, and two (1.7%) were known. In conclusion, we built a language-agnostic pipeline for mining associations between free-text information and medication exposure without manual curation, predicting not the likely outcome of a range of exposures but rather the likely exposures for outcomes of interest. Our approach may help overcome limitations of text mining methods relying on curated data in English and can help leverage non-English free text for pharmacovigilance.
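
The modelling step summarised above (one model per exposure, each predicting the risk of exposure from a note embedding) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the network architecture, so a single logistic unit stands in for each neural network, and made-up three-dimensional vectors stand in for the fastText embeddings. All names (`train_exposure_model`, `predict_risk`, `drug_A`, `drug_A+drug_B`) and all data are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_exposure_model(vectors, labels, epochs=200, lr=0.5):
    """Fit one logistic unit estimating P(exposed | embedding vector).

    Assumption: a stand-in for the paper's per-exposure neural network.
    """
    dim = len(vectors[0])
    weights, bias = [0.0] * dim, 0.0
    n = len(vectors)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(vectors, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = pred - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        # Full-batch gradient descent step on the mean log-loss.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_risk(model, vector):
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, vector)) + bias)


# Toy "note embeddings": two synthetic clusters (hypothetical data; in the
# described pipeline these vectors would come from a fastText embedding).
rng = random.Random(0)


def toy_embedding(center):
    return [c + rng.gauss(0, 0.3) for c in center]


notes = ([toy_embedding([1.0, 0.0, 0.0]) for _ in range(50)]
         + [toy_embedding([0.0, 1.0, 0.0]) for _ in range(50)])

# One binary label vector per exposure (single drug or drug pair).
exposures = {
    "drug_A": [1] * 50 + [0] * 50,
    "drug_A+drug_B": [0] * 50 + [1] * 50,
}

# One model per distinct exposure, mirroring the one-model-per-exposure design.
models = {name: train_exposure_model(notes, labels)
          for name, labels in exposures.items()}

for name, model in models.items():
    print(name, round(predict_risk(model, [1.0, 0.0, 0.0]), 3))
```

In this toy setup, a note vector near the first cluster should receive a high predicted risk for `drug_A` and a low one for `drug_A+drug_B`. How such predictions are turned into reviewable safety signals is not described in the abstract and is left out here.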

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d01d/9541191/e50ca0bb81c8/BCPT-131-282-g004.jpg
