Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, USA.
College of Pharmacy, University of Minnesota, Minneapolis, Minnesota, USA.
J Am Med Inform Assoc. 2021 Mar 1;28(3):569-577. doi: 10.1093/jamia/ocaa218.
We sought to demonstrate the feasibility of utilizing deep learning models to extract safety signals related to the use of dietary supplements (DSs) in clinical text.
This study comprised two tasks. For the named entity recognition (NER) task, Bi-LSTM-CRF (bidirectional long short-term memory conditional random field) and BERT (bidirectional encoder representations from transformers) models were trained to recognize DS and event entities in clinical notes and were compared with a CRF model as a baseline. For the relation extraction (RE) task, 2 deep learning models (an attention-based Bi-LSTM and a convolutional neural network) and a random forest model were trained to extract relations between DSs and events, which were categorized into 3 classes: positive (ie, indication), negative (ie, adverse event), and not related. The best-performing NER and RE models were then applied to clinical notes mentioning 88 DSs to discover DS adverse events and indications, which were compared with a DS knowledge base.
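The hand-off from NER to RE can be illustrated with a minimal sketch. It assumes that candidate relations are formed by pairing every DS mention with every event mention found in the same sentence; the abstract does not state how candidates were built, and the `Entity` type, the label strings, and `candidate_pairs` are hypothetical names introduced only for illustration. Each candidate would then be assigned one of the 3 relation classes by an RE model.

```python
from itertools import product
from typing import List, NamedTuple, Tuple


class Entity(NamedTuple):
    """One recognized mention; the label strings are assumed, not from the study."""
    text: str
    label: str  # "DS" or "EVENT"


# The 3 relation classes named in the abstract.
RELATION_CLASSES = ("positive", "negative", "not_related")


def candidate_pairs(entities: List[Entity]) -> List[Tuple[Entity, Entity]]:
    """Pair every DS mention with every event mention (assumed: same sentence)."""
    supplements = [e for e in entities if e.label == "DS"]
    events = [e for e in entities if e.label == "EVENT"]
    return list(product(supplements, events))


# Placeholder mentions, not data from the study.
sentence_entities = [
    Entity("supplement_a", "DS"),
    Entity("event_x", "EVENT"),
    Entity("event_y", "EVENT"),
]
pairs = candidate_pairs(sentence_entities)
```

Here `pairs` holds two candidates, one per event mention, each of which an RE classifier would label with a class from `RELATION_CLASSES`.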
For the NER task, the deep learning models outperformed the CRF baseline, with F1 scores above 0.860. The attention-based Bi-LSTM model performed best in the RE task, with an F1 score of 0.893. Comparing the DS-event pairs generated by the deep learning models with the knowledge base of DSs and events revealed both known and unknown pairs.
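For reference, the F1 score is the harmonic mean of precision and recall, and the comparison with the knowledge base amounts to splitting extracted pairs into those already recorded and those not. The sketch below shows both under stated assumptions: pairs are represented as `(supplement, event)` string tuples with exact matching, and the function names and placeholder values are hypothetical rather than taken from the study.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (supplement, event); assumed representation


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def split_by_knowledge_base(
    extracted: Set[Pair], knowledge_base: Set[Pair]
) -> Tuple[Set[Pair], Set[Pair]]:
    """Return (known, unknown) pairs relative to the knowledge base."""
    known = extracted & knowledge_base
    unknown = extracted - knowledge_base
    return known, unknown


# Placeholder values, not results from the study.
example_f1 = f1_score(tp=8, fp=2, fn=2)
extracted = {("supplement_a", "event_x"), ("supplement_b", "event_y")}
knowledge_base = {("supplement_a", "event_x")}
known, unknown = split_by_knowledge_base(extracted, knowledge_base)
```

Exact string matching is the simplest possible choice; in practice, supplement and event names would likely need normalization before such a comparison.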
Deep learning models can detect adverse events and indications of DSs in clinical notes, a capability that holds great potential for monitoring the safety of DS use.