Semiology Extraction and Machine Learning-Based Classification of Electronic Health Records for Patients With Epilepsy: Retrospective Analysis.

Authors

Xia Yilin, He Mengqiao, Basang Sijia, Sha Leihao, Huang Zijie, Jin Ling, Duan Yifei, Tang Yusha, Li Hua, Lai Wanlin, Chen Lei

Affiliations

Department of Neurology, West China Hospital, Sichuan University, #37 Guoxue Alley, Wuhou District, Chengdu, China, 86 18980605819.

Sichuan Provincial Engineering Research Center of Brain-Machine Interface, and Sichuan Provincial Engineering Research Center of Neuromodulation, West China Hospital, Sichuan University, Chengdu, China.

Publication information

JMIR Med Inform. 2024 Oct 17;12:e57727. doi: 10.2196/57727.

DOI:10.2196/57727
PMID:39621862
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11501417/

Abstract

BACKGROUND

Obtaining and describing semiology efficiently and classifying seizure types correctly are crucial for the diagnosis and treatment of epilepsy. Nevertheless, there exists an inadequacy in related informatics resources and decision support tools.

OBJECTIVE

We developed a symptom entity extraction tool and an epilepsy semiology ontology (ESO) and used machine learning to achieve an automated binary classification of epilepsy in this study.

METHODS

Using present history data of electronic health records from the Southwest Epilepsy Center in China, we constructed an ESO and a symptom-entity extraction tool to extract seizure duration, seizure symptoms, and seizure frequency from the unstructured text by combining manual annotation with natural language processing techniques. In addition, we achieved automatic classification of patients in the study cohort with high accuracy based on the extracted seizure feature data using multiple machine learning methods.
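The extraction step described above can be pictured with a minimal, dictionary-driven sketch. This is not the authors' tool: the toy ontology entries, the regular expressions, the English example note, and the function name `extract_features` are all assumptions made for illustration (the source records are Chinese-language and the real ESO is far larger).

```python
import re

# Hypothetical mini-ontology mapping surface terms to canonical concepts.
# These entries are invented for illustration and are not taken from the ESO.
TOY_ONTOLOGY = {
    "loss of consciousness": "impaired_awareness",
    "limb twitching": "motor_clonic",
    "limb convulsions": "motor_clonic",
    "staring": "behavior_arrest",
}

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}
PERIOD_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}

# Assumed phrasing patterns; real notes would need many more variants.
DURATION_RE = re.compile(r"lasting (?:about )?(\d+) (second|minute|hour)s?")
FREQUENCY_RE = re.compile(r"(\d+) times? (?:per|a) (day|week|month|year)")


def extract_features(text):
    """Return symptoms, duration (seconds), and frequency (events per 30 days)."""
    lowered = text.lower()

    # Symptom extraction: look up every ontology term in the note and
    # collapse synonyms onto their canonical concept.
    symptoms = sorted(
        {concept for term, concept in TOY_ONTOLOGY.items() if term in lowered}
    )

    # Seizure duration, normalized to seconds (None if not stated).
    duration = None
    match = DURATION_RE.search(lowered)
    if match:
        duration = int(match.group(1)) * UNIT_SECONDS[match.group(2)]

    # Seizure frequency, normalized to events per 30 days (None if not stated).
    frequency = None
    match = FREQUENCY_RE.search(lowered)
    if match:
        frequency = int(match.group(1)) * 30 / PERIOD_DAYS[match.group(2)]

    return {
        "symptoms": symptoms,
        "duration_seconds": duration,
        "events_per_30_days": frequency,
    }


note = "Episodes of staring and limb twitching, lasting about 2 minutes, 3 times per month."
features = extract_features(note)
print(features)
```

Normalizing duration and frequency to fixed units is one way to turn free text into numeric features that a downstream classifier can consume; the study's actual feature encoding is not described in the abstract.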

RESULTS

Data included present history from 10,925 cases between 2010 and 2020. Six annotators labeled a total of 2500 texts to obtain 5844 words of semiology and construct an ESO with 702 terms. Based on the ontology, the extraction tool achieved an accuracy rate of 85% in symptom extraction. Furthermore, we trained a stacking ensemble learning model combining XGBoost and random forest with an F1-score of 75.03%. The random forest model had the highest area under the curve (0.985).
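The results report an F1-score for one model and an area under the curve for another, and the two metrics measure different things: F1 depends on the decision threshold, whereas AUC depends only on how the scores rank positive cases against negative ones. The following sketch computes both from scratch on made-up labels and scores. The data, the 0.5 threshold, and the function names are assumptions for illustration and do not reproduce the study's evaluation.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented toy data: 3 positive and 5 negative cases with model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.45, 0.3, 0.2, 0.1, 0.05]

# Thresholding at 0.5 (an assumed cut-off) turns scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"F1 = {f1:.3f}, AUC = {auc:.3f}")
```

In this toy case one positive falls below the threshold, which lowers recall and therefore F1, while the ranking is still nearly perfect, so AUC stays high. A gap between the two metrics is therefore not contradictory in itself.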

CONCLUSIONS

This work demonstrated the feasibility of natural language processing-assisted structural extraction of epilepsy medical record texts and downstream tasks, providing open ontology resources for subsequent related work.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/278e/11501417/48c48e1c297b/medinform-v12-e57727-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/278e/11501417/75a3f97f166a/medinform-v12-e57727-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/278e/11501417/6cbf4b55a2da/medinform-v12-e57727-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/278e/11501417/d87438c76926/medinform-v12-e57727-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/278e/11501417/406854219fb5/medinform-v12-e57727-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/278e/11501417/7b7dcc7890db/medinform-v12-e57727-g005.jpg

Similar articles

1
Semiology Extraction and Machine Learning-Based Classification of Electronic Health Records for Patients With Epilepsy: Retrospective Analysis.
JMIR Med Inform. 2024 Oct 17;12:e57727. doi: 10.2196/57727.
2
Classifying social determinants of health from unstructured electronic health records using deep learning-based natural language processing.
J Biomed Inform. 2022 Mar;127:103984. doi: 10.1016/j.jbi.2021.103984. Epub 2022 Jan 7.
3
Using Natural Language Processing and Machine Learning to classify the status of kidney allograft in Electronic Medical Records written in Spanish.
PLoS One. 2025 May 8;20(5):e0322587. doi: 10.1371/journal.pone.0322587. eCollection 2025.
4
Using natural language processing to extract structured epilepsy data from unstructured clinic letters: development and validation of the ExECT (extraction of epilepsy clinical text) system.
BMJ Open. 2019 Apr 1;9(4):e023232. doi: 10.1136/bmjopen-2018-023232.
5
Comparison of an Ensemble of Machine Learning Models and the BERT Language Model for Analysis of Text Descriptions of Brain CT Reports to Determine the Presence of Intracranial Hemorrhage.
Sovrem Tekhnologii Med. 2024;16(1):27-34. doi: 10.17691/stm2024.16.1.03. Epub 2024 Feb 28.
6
Extracting seizure frequency from epilepsy clinic notes: a machine reading approach to natural language processing.
J Am Med Inform Assoc. 2022 Apr 13;29(5):873-881. doi: 10.1093/jamia/ocac018.
7
Task definition, annotated dataset, and supervised natural language processing models for symptom extraction from unstructured clinical notes.
J Biomed Inform. 2020 Feb;102:103354. doi: 10.1016/j.jbi.2019.103354. Epub 2019 Dec 12.
8
A comparison of word embeddings for the biomedical natural language processing.
J Biomed Inform. 2018 Nov;87:12-20. doi: 10.1016/j.jbi.2018.09.008. Epub 2018 Sep 12.
9
Natural language processing and machine learning to enable automatic extraction and classification of patients' smoking status from electronic medical records.
Ups J Med Sci. 2020 Nov;125(4):316-324. doi: 10.1080/03009734.2020.1792010. Epub 2020 Jul 22.
10
Predicting seizure recurrence after an initial seizure-like episode from routine clinical notes using large language models: a retrospective cohort study.
Lancet Digit Health. 2023 Dec;5(12):e882-e894. doi: 10.1016/S2589-7500(23)00179-6.
