Utility analysis and demonstration of real-world clinical texts: A case study on Japanese cancer-related EHRs.

Affiliations

Division of Information Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan.

Artificial Intelligence and Digital Twin in Healthcare, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.

Publication information

PLoS One. 2024 Sep 11;19(9):e0310432. doi: 10.1371/journal.pone.0310432. eCollection 2024.

Abstract

Real-world data (RWD) in the medical field, such as electronic health records (EHRs) and medication orders, are receiving increasing attention from researchers and practitioners. While structured data have played a vital role thus far, unstructured text data (e.g., discharge summaries) remain underused because medical information is difficult to extract from them. We evaluated the information gained by supplementing structured data with clinical concepts extracted from unstructured text using natural language processing techniques. Using a machine learning-based pretrained named entity recognition tool, we extracted disease and medication names from real discharge summaries in a Japanese hospital and linked them to medical concepts using medical term dictionaries. By comparing the diseases and medications mentioned in the text with medical codes in tabular diagnosis records, we found that (1) the text data contained richer information on patient symptoms than the tabular diagnosis records, whereas the medication-order table stored more injection data than the text, and (2) extractable information regarding specific diseases showed surprisingly small intersections among text, diagnosis records, and medication orders. Text data can thus be a useful supplement for RWD mining, which is further demonstrated by (3) our practical application system for drug safety evaluation, which exhaustively visualizes suspected adverse drug effects caused by the simultaneous use of anticancer drug pairs. We conclude that proper use of textual information extraction can lead to better outcomes in medical RWD mining.
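
The overlap comparison among text, diagnosis records, and medication orders can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `venn_regions`, the source labels, and the patient IDs are all hypothetical toy values. It assumes each source has already been reduced to a set of patient identifiers associated with one specific disease.

```python
def venn_regions(sources):
    """Count items in each exclusive Venn region over named sets.

    `sources` maps a source name to a set of identifiers.
    Returns a dict keyed by the sorted tuple of source names that
    contain an item, with the number of such items as the value.
    """
    names = sorted(sources)
    universe = set().union(*sources.values())
    regions = {}
    for item in universe:
        # The region key lists every source in which this item appears.
        key = tuple(name for name in names if item in sources[name])
        regions[key] = regions.get(key, 0) + 1
    return regions


# Toy, made-up patient IDs for one disease (illustration only).
sources = {
    "text": {"p1", "p2", "p3", "p4"},       # found via NER on discharge summaries
    "diagnosis": {"p3", "p4", "p5"},        # coded in tabular diagnosis records
    "orders": {"p4", "p6"},                 # implied by medication orders
}

regions = venn_regions(sources)
for key, count in sorted(regions.items()):
    print(" & ".join(key), count)
```

In this toy input only one of six patients appears in all three sources, which is the kind of small intersection the abstract describes; real counts would depend entirely on the hospital data and the dictionaries used for concept linking.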

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/87dc/11389901/7fcadbb998de/pone.0310432.g001.jpg
