
MedJEx: A Medical Jargon Extraction Model with Wiki's Hyperlink Span and Contextualized Masked Language Model Score.

Authors

Kwon Sunjae, Yao Zonghai, Jordan Harmon S, Levy David A, Corner Brian, Yu Hong

Affiliations

UMass Amherst.

Health Research Consultant.

Publication information

Proc Conf Empir Methods Nat Lang Process. 2022 Dec;2022:11733-11751.


DOI:
PMID:37103473
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10129059/
Abstract

This paper proposes a new natural language processing (NLP) application for identifying medical jargon terms potentially difficult for patients to comprehend from electronic health record (EHR) notes. We first present a novel and publicly available dataset with expert-annotated medical jargon terms from 18K+ EHR note sentences (MedJ). Then, we introduce a novel medical jargon extraction (MedJEx) model which has been shown to outperform existing state-of-the-art NLP models. First, MedJEx improved the overall performance when it was trained on an auxiliary Wikipedia hyperlink span dataset, where hyperlink spans provide additional Wikipedia articles to explain the spans (or terms), and then fine-tuned on the annotated MedJ data. Secondly, we found that a contextualized masked language model score was beneficial for detecting domain-specific unfamiliar jargon terms. Moreover, our results show that training on the auxiliary Wikipedia hyperlink span datasets improved six out of eight biomedical named entity recognition benchmark datasets. Both MedJ and MedJEx are publicly available.
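
As a rough illustration of how hyperlink spans could act as auxiliary supervision for span extraction, the sketch below turns character-level link spans in a sentence into token-level BIO tags, a common input format for sequence-labeling models. The abstract does not describe the authors' preprocessing, so everything here is an assumption made for illustration: the whitespace tokenizer, the hypothetical `spans_to_bio` helper, and the example sentence with its link spans.

```python
import re
from typing import List, Tuple


def spans_to_bio(text: str, spans: List[Tuple[int, int]]) -> List[Tuple[str, str]]:
    """Tag whitespace tokens with B/I/O labels from character spans [start, end).

    Hypothetical helper for illustration; not the authors' preprocessing.
    """
    tagged = []
    for match in re.finditer(r"\S+", text):
        tag = "O"
        for start, end in spans:
            # A token is labeled only if it lies fully inside a span.
            if match.start() >= start and match.end() <= end:
                tag = "B" if match.start() == start else "I"
                break
        tagged.append((match.group(), tag))
    return tagged


# Made-up sentence; the two "hyperlinked" phrases are assumed for the example.
sentence = "The patient has atrial fibrillation and hypertension"
link_spans = []
for phrase in ("atrial fibrillation", "hypertension"):
    begin = sentence.index(phrase)
    link_spans.append((begin, begin + len(phrase)))

tagged = spans_to_bio(sentence, link_spans)
for token, tag in tagged:
    print(f"{token}\t{tag}")
```

A model pretrained on labels of this kind could then be fine-tuned on expert-annotated jargon spans in the same format, which mirrors the two-stage training the abstract outlines. Real text would need a tokenizer that separates punctuation from words, since this sketch only labels tokens that fall entirely within a span.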


Similar articles

[1]
MedJEx: A Medical Jargon Extraction Model with Wiki's Hyperlink Span and Contextualized Masked Language Model Score.

Proc Conf Empir Methods Nat Lang Process. 2022-12

[2]
A comparison of word embeddings for the biomedical natural language processing.

J Biomed Inform. 2018-9-12

[3]
Finding Important Terms for Patients in Their Electronic Health Records: A Learning-to-Rank Approach Using Expert Annotations.

JMIR Med Inform. 2016-11-30

[4]
Unsupervised ensemble ranking of terms in electronic health record notes based on their importance to patients.

J Biomed Inform. 2017-4

[5]
Evaluating Expert-Layperson Agreement in Identifying Jargon Terms in Electronic Health Record Notes: Observational Study.

J Med Internet Res. 2024-10-15

[6]
Benchmarking for biomedical natural language processing tasks with a domain specific ALBERT.

BMC Bioinformatics. 2022-4-21

[7]
Ranking Medical Terms to Support Expansion of Lay Language Resources for Patient Comprehension of Electronic Health Record Notes: Adapted Distant Supervision Approach.

JMIR Med Inform. 2017-10-31

[8]
Contextualized medication event extraction with striding NER and multi-turn QA.

J Biomed Inform. 2023-8

[9]
Task definition, annotated dataset, and supervised natural language processing models for symptom extraction from unstructured clinical notes.

J Biomed Inform. 2020-2

[10]
Biomedical and clinical English model packages for the Stanza Python NLP library.

J Am Med Inform Assoc. 2021-8-13

Cited by

[1]
MedReadCtrl: Personalizing medical text generation with readability-controlled instruction learning.

medRxiv. 2025-7-11

[2]
MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain.

Proc Conf Empir Methods Nat Lang Process. 2024-11

[3]
ODD: A Benchmark Dataset for the Natural Language Processing Based Opioid Related Aberrant Behavior Detection.

Proc Conf. 2024-6

[4]
Context Variance Evaluation of Pretrained Language Models for Prompt-based Biomedical Knowledge Probing.

AMIA Jt Summits Transl Sci Proc. 2023-6-16

[5]
Automated identification of eviction status from electronic health record notes.

J Am Med Inform Assoc. 2023-7-19

