Department of Electrical Engineering, College of Electrical Engineering and Computer Science, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan.
Department of Post-Baccalaureate Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan.
J Am Med Inform Assoc. 2020 Jan 1;27(1):47-55. doi: 10.1093/jamia/ocz120.
An adverse drug event (ADE) is an injury resulting from a medical intervention related to a drug, including harm caused by the drug itself or by its usage. Extracting ADEs from clinical records can help physicians associate adverse events with the targeted drugs.
We proposed a cascading architecture to recognize medical concepts, including ADEs, drug names, and drug-related entities. The architecture comprises a preprocessing method and an ensemble of conditional random fields (CRFs) and neural network-based models, which address, respectively, the challenges of surrogate strings and overlapping annotation boundaries observed in the ADE and medication extraction (ADME) corpus used in this study. We also studied the effectiveness of different pretrained and postprocessed word embeddings for the ADME task.
The empirical results showed that both CRFs and neural network-based models provide promising solutions for the ADME task. The neural network-based models outperformed CRFs particularly on concept types involving narrative descriptions. Our best run achieved an overall micro F-score of 0.919 on the employed corpus. Our results also suggested that the general-domain Global Vectors for Word Representation (GloVe) embedding provides a very strong baseline, which can be further improved by applying principal component analysis to generate more isotropic vectors.
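One common way to make word vectors more isotropic with principal component analysis is to center them and project out the few directions that carry the most variance. The abstract does not spell out the exact procedure used, so the following is only a minimal sketch under that assumption; the function name `remove_top_components` and the default of two removed components are hypothetical choices for illustration.

```python
import numpy as np

def remove_top_components(embeddings, n_components=2):
    """Center the vectors and project out the top principal directions.

    Assumption: this is a generic PCA-based postprocessing step, not
    necessarily the exact procedure used in the study.
    """
    # Subtract the mean vector so that every dimension has zero mean.
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[:n_components]
    # Remove the part of each vector that lies along the dominant directions.
    return centered - centered @ top.T @ top

# Toy stand-in for a pretrained embedding matrix (rows are words).
rng = np.random.default_rng(0)
vectors = rng.normal(size=(50, 8)) + 5.0
processed = remove_top_components(vectors, n_components=2)
```

The processed matrix keeps its shape, so it could replace the original embedding table in a downstream tagger without other changes.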
We have demonstrated that the proposed cascading architecture can handle overlapping annotations and further improve overall recall and F-scores, because it enables the developed models to exploit more context information and combines them into an ensemble that forms a stronger recognizer.
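A micro-averaged F-score pools true positives, false positives, and false negatives across all concept types before computing precision and recall. The sketch below illustrates this with exact span matching, which is an assumption; the evaluation behind the reported scores may have used a different matching criterion. The function name `micro_prf`, the tuple layout, and the toy annotations are hypothetical.

```python
def micro_prf(gold, predicted):
    """Micro-averaged precision, recall, and F-score over annotation sets.

    Each annotation is a (doc_id, start, end, concept_type) tuple.
    Assumption: a prediction counts only if it matches a gold tuple exactly.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score

# Invented toy annotations, pooled over all concept types.
gold = {("d1", 0, 5, "Drug"), ("d1", 10, 18, "ADE"), ("d1", 20, 25, "Dose")}
predicted = {("d1", 0, 5, "Drug"), ("d1", 10, 18, "ADE"),
             ("d1", 30, 35, "Drug"), ("d1", 40, 44, "Route")}
precision, recall, f_score = micro_prf(gold, predicted)
```

Because the counts are pooled, frequent concept types weigh more heavily in the micro average than rare ones.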