Annotation of epilepsy clinic letters for natural language processing.

Affiliations

Swansea University Medical School, Swansea University, Swansea, Wales, UK.

Neurology Department, Swansea Bay University Health Board, Swansea, Wales, UK.

Publication information

J Biomed Semantics. 2024 Sep 15;15(1):17. doi: 10.1186/s13326-024-00316-z.

DOI:10.1186/s13326-024-00316-z
PMID:39277770
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11402197/
Abstract

BACKGROUND

Natural language processing (NLP) is increasingly being used to extract structured information from unstructured text to assist clinical decision-making and aid healthcare research. The availability of expert-annotated documents for the development and validation of NLP applications is limited. We created synthetic clinical documents to address this, and to validate the Extraction of Epilepsy Clinical Text version 2 (ExECTv2) NLP pipeline.

METHODS

We created 200 synthetic clinic letters based on hospital outpatient consultations with epilepsy specialists. The letters were double annotated by trained clinicians and researchers according to agreed guidelines. We used the annotation tool, Markup, with an epilepsy concept list based on the Unified Medical Language System ontology. All annotations were reviewed, and a gold standard set of annotations was agreed and used to validate the performance of ExECTv2.
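The double-annotation step described above can be pictured with a minimal sketch. It assumes each annotation is a span in a letter plus a concept label, and that two annotations agree only when letter, offsets, and concept all match exactly. The `Annotation` type, the `split_for_review` helper, and the example values are hypothetical illustrations; they are not the Markup export format or the authors' review procedure.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Annotation(NamedTuple):
    """One annotated span (hypothetical layout, not the Markup format)."""
    letter_id: str
    start: int      # character offset where the span begins
    end: int        # character offset where the span ends
    concept: str    # concept label, e.g. drawn from a UMLS-based list


def split_for_review(
    set_a: Iterable[Annotation], set_b: Iterable[Annotation]
) -> Tuple[Set[Annotation], Set[Annotation]]:
    """Separate annotations both annotators made from those only one made.

    Assumes exact matching; the disputed set is what a consensus review
    would need to resolve before a gold standard can be fixed.
    """
    a, b = set(set_a), set(set_b)
    agreed = a & b
    disputed = a ^ b  # symmetric difference: present in exactly one set
    return agreed, disputed


# Made-up example: two annotators mark the same letter.
annotator_1 = [
    Annotation("letter_001", 10, 18, "Epilepsy"),
    Annotation("letter_001", 55, 66, "Lamotrigine"),
]
annotator_2 = [
    Annotation("letter_001", 10, 18, "Epilepsy"),
    Annotation("letter_001", 80, 87, "Seizure"),
]

agreed, disputed = split_for_review(annotator_1, annotator_2)
```

Here one annotation is shared and two are disputed; in practice a reviewer would decide which disputed items enter the gold standard.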

RESULTS

The overall inter-annotator agreement (IAA) between the two sets of annotations produced a per-item F1 score of 0.73. Validating ExECTv2 against the gold standard gave an overall F1 score of 0.87 per item and 0.90 per letter.
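As a rough illustration of how such scores can be computed, the sketch below derives F1 from two sets of annotations. It rests on two assumptions that the abstract does not confirm: that items match only on exact equality, and that "per letter" means each concept is counted at most once per letter regardless of how often it is mentioned. The function names and example data are hypothetical.

```python
from typing import Iterable, Set, Tuple

# (letter_id, start_offset, end_offset, concept): a hypothetical item layout
Item = Tuple[str, int, int, str]


def f1_score(reference: Iterable, predicted: Iterable) -> float:
    """F1 = harmonic mean of precision and recall, with exact matching."""
    ref, pred = set(reference), set(predicted)
    true_positives = len(ref & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


def per_letter(items: Iterable[Item]) -> Set[Tuple[str, str]]:
    """Collapse repeated mentions: keep one (letter, concept) pair each."""
    return {(letter_id, concept) for letter_id, _start, _end, concept in items}


# Made-up data: the reference mentions "Epilepsy" twice in one letter.
gold = [
    ("letter_001", 0, 8, "Epilepsy"),
    ("letter_001", 40, 48, "Epilepsy"),
    ("letter_001", 60, 71, "Lamotrigine"),
]
system = [
    ("letter_001", 0, 8, "Epilepsy"),
    ("letter_001", 60, 71, "Lamotrigine"),
    ("letter_001", 80, 87, "Seizure"),
]

item_f1 = f1_score(gold, system)                            # per item
letter_f1 = f1_score(per_letter(gold), per_letter(system))  # per letter
```

In this toy case the per-letter score is higher than the per-item score because the missed second mention of "Epilepsy" no longer counts as an error once mentions are collapsed. The same function could score one annotator against the other for IAA, since F1 is symmetric under exact matching.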

CONCLUSION

The synthetic letters, annotations, and annotation guidelines have been made freely available. To our knowledge, this is the first publicly available set of annotated epilepsy clinic letters and guidelines that can be used by NLP researchers with minimal epilepsy knowledge. The IAA results show that clinical text annotation tasks are difficult and that a gold standard must be agreed by researcher consensus. ExECTv2, our automated epilepsy NLP pipeline, extracted detailed epilepsy information from unstructured epilepsy letters with higher accuracy than human annotators, further confirming the utility of NLP for clinical and research applications.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f90b/11402197/9bbcbe67f053/13326_2024_316_Figa_HTML.jpg
