
Machine Learning Methods to Extract Documentation of Breast Cancer Symptoms From Electronic Health Records.

Affiliations

Department of Electrical Engineering and Computer Science, CSAIL, MIT, Cambridge, Massachusetts.

Department of Surgical Oncology, Massachusetts General Hospital, Boston, Massachusetts.

Publication information

J Pain Symptom Manage. 2018 Jun;55(6):1492-1499. doi: 10.1016/j.jpainsymman.2018.02.016. Epub 2018 Feb 27.

DOI:10.1016/j.jpainsymman.2018.02.016
PMID:29496537
Abstract

CONTEXT

Clinicians document cancer patients' symptoms in free-text format within electronic health record visit notes. Although symptoms are critically important to quality of life and often herald clinical status changes, computational methods to assess the trajectory of symptoms over time are woefully underdeveloped.

OBJECTIVES

To create machine learning algorithms capable of extracting patient-reported symptoms from free-text electronic health record notes.

METHODS

The data set included 103,564 sentences obtained from the electronic clinical notes of 2695 breast cancer patients receiving paclitaxel-containing chemotherapy at two academic cancer centers between May 1996 and May 2015. We manually annotated 10,000 sentences and trained a conditional random field model to predict words indicating an active symptom (positive label), absence of a symptom (negative label), or no symptom at all (neutral label). Sentences labeled by a human coder were divided into training, validation, and test data sets. Final model performance was determined on 20% test data unused in model development or tuning.
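The labeling and data-splitting setup described in the methods can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the example sentences, the per-token features, the helper names `token_features` and `split_sentences`, and the 70/10/20 proportions (the abstract states only that 20% was held out for testing). It is not the authors' implementation, and it stops short of training a conditional random field.

```python
import random

# The three word-level labels named in the abstract.
LABELS = ("positive", "negative", "neutral")


def token_features(tokens, i):
    """Illustrative per-token features of the kind a CRF could consume.

    The feature set is hypothetical; the abstract does not list the features used.
    """
    word = tokens[i]
    return {
        "lower": word.lower(),
        "is_title": word.istitle(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def split_sentences(sentences, test_frac=0.2, val_frac=0.1, seed=0):
    """Shuffle annotated sentences and split them into train/validation/test.

    Only the 20% test fraction comes from the abstract; the validation
    fraction is an assumed value.
    """
    items = list(sentences)
    random.Random(seed).shuffle(items)
    n_test = round(len(items) * test_frac)
    n_val = round(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test


# Toy annotated sentences (invented): one label per token.
annotated = [
    (["Patient", "reports", "nausea", "."],
     ["neutral", "neutral", "positive", "neutral"]),
    (["Denies", "pain", "."],
     ["neutral", "negative", "neutral"]),
] * 50  # 100 toy sentences

train, val, test = split_sentences(annotated)
features = token_features(["Denies", "pain", "."], 1)
```

Keeping the test split untouched until the end mirrors the abstract's statement that final performance was measured on data unused in model development or tuning.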

RESULTS

The final model achieved precision of 0.82, 0.86, and 0.99 and recall of 0.56, 0.69, and 1.00 for positive, negative, and neutral symptom labels, respectively. The most common positive symptoms were pain, fatigue, and nausea. Machine-based labeling of 103,564 sentences took two minutes.
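To make the reported metrics concrete, the following Python sketch shows how per-label precision and recall can be computed from word-level labels, and derives two figures that are not stated in the abstract: the F1 score implied by each precision/recall pair and the approximate labeling throughput. The helper names and the toy gold/predicted sequences are hypothetical; the derived numbers follow arithmetically from the reported values.

```python
def precision_recall(gold, pred, label):
    """Per-label precision and recall over aligned gold/predicted labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Toy sequences (invented) to show the computation.
gold = ["positive", "neutral", "negative", "positive"]
pred = ["positive", "neutral", "neutral", "neutral"]
toy_p, toy_r = precision_recall(gold, pred, "positive")  # 1.0, 0.5

# (precision, recall) pairs as reported in the abstract.
reported = {
    "positive": (0.82, 0.56),
    "negative": (0.86, 0.69),
    "neutral": (0.99, 1.00),
}
# F1 is derived here; the abstract does not report it.
f1_scores = {label: round(f1(p, r), 2) for label, (p, r) in reported.items()}

# Throughput implied by labeling 103,564 sentences in two minutes.
sentences_per_second = 103564 / 120
```

Under these reported values the implied F1 scores are roughly 0.67 (positive), 0.77 (negative), and 0.99 (neutral), and the throughput works out to about 863 sentences per second.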

CONCLUSION

We demonstrate the potential of machine learning to gather, track, and analyze symptoms experienced by cancer patients during chemotherapy. Although our initial model requires further optimization to improve the performance, further model building may yield machine learning methods suitable to be deployed in routine clinical care, quality improvement, and research applications.

Similar articles

1
Machine Learning Methods to Extract Documentation of Breast Cancer Symptoms From Electronic Health Records.
J Pain Symptom Manage. 2018 Jun;55(6):1492-1499. doi: 10.1016/j.jpainsymman.2018.02.016. Epub 2018 Feb 27.
2
Strategies to Address the Lack of Labeled Data for Supervised Machine Learning Training With Electronic Health Records: Case Study for the Extraction of Symptoms From Clinical Notes.
JMIR Med Inform. 2022 Mar 14;10(3):e32903. doi: 10.2196/32903.
3
Machine learning and natural language processing (NLP) approach to predict early progression to first-line treatment in real-world hormone receptor-positive (HR+)/HER2-negative advanced breast cancer patients.
Eur J Cancer. 2021 Feb;144:224-231. doi: 10.1016/j.ejca.2020.11.030. Epub 2020 Dec 26.
4
Natural Language Processing Accurately Differentiates Cancer Symptom Information in Electronic Health Record Narratives.
JCO Clin Cancer Inform. 2024 Aug;8:e2300235. doi: 10.1200/CCI.23.00235.
5
Identification of Uncontrolled Symptoms in Cancer Patients Using Natural Language Processing.
J Pain Symptom Manage. 2022 Apr;63(4):610-617. doi: 10.1016/j.jpainsymman.2021.10.014. Epub 2021 Nov 4.
6
Using natural language processing and machine learning to identify breast cancer local recurrence.
BMC Bioinformatics. 2018 Dec 28;19(Suppl 17):498. doi: 10.1186/s12859-018-2466-x.
7
Identifying Goals of Care Conversations in the Electronic Health Record Using Natural Language Processing and Machine Learning.
J Pain Symptom Manage. 2021 Jan;61(1):136-142.e2. doi: 10.1016/j.jpainsymman.2020.08.024. Epub 2020 Aug 25.
8
Natural language processing and machine learning to enable automatic extraction and classification of patients' smoking status from electronic medical records.
Ups J Med Sci. 2020 Nov;125(4):316-324. doi: 10.1080/03009734.2020.1792010. Epub 2020 Jul 22.
9
Machine learning models to detect social distress, spiritual pain, and severe physical psychological symptoms in terminally ill patients with cancer from unstructured text data in electronic medical records.
Palliat Med. 2022 Sep;36(8):1207-1216. doi: 10.1177/02692163221105595. Epub 2022 Jun 30.
10
Detecting rare diseases in electronic health records using machine learning and knowledge engineering: Case study of acute hepatic porphyria.
PLoS One. 2020 Jul 2;15(7):e0235574. doi: 10.1371/journal.pone.0235574. eCollection 2020.

Cited by

1
High-Throughput Phenotyping of the Symptoms of Alzheimer Disease and Related Dementias Using Large Language Models: Cross-Sectional Study.
JMIR AI. 2025 Jun 3;4:e66926. doi: 10.2196/66926.
2
Extraction of Normalized Symptom Mentions From Clinical Narratives Using Large Language Models.
AMIA Annu Symp Proc. 2025 May 22;2024:600-609. eCollection 2024.
3
Improving Clinical Documentation with Artificial Intelligence: A Systematic Review.
Perspect Health Inf Manag. 2024 Jun 1;21(2):1d. eCollection 2024 Summer-Fall.
4
Qualitative Health-Related Quality of Life and Natural Language Processing: Characteristics, Implications, and Challenges.
Healthcare (Basel). 2024 Oct 8;12(19):2008. doi: 10.3390/healthcare12192008.
5
Automated Extraction of Patient-Centered Outcomes After Breast Cancer Treatment: An Open-Source Large Language Model-Based Toolkit.
JCO Clin Cancer Inform. 2024 Aug;8:e2300258. doi: 10.1200/CCI.23.00258.
6
Using natural language processing to analyze unstructured patient-reported outcomes data derived from electronic health records for cancer populations: a systematic review.
Expert Rev Pharmacoecon Outcomes Res. 2024 Apr;24(4):467-475. doi: 10.1080/14737167.2024.2322664. Epub 2024 Mar 5.
7
21st century (clinical) decision support in nursing and allied healthcare. Developing a learning health system: a reasoned design of a theoretical framework.
BMC Med Inform Decis Mak. 2023 Dec 5;23(1):279. doi: 10.1186/s12911-023-02372-4.
8
Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review.
Artif Intell Med. 2023 Dec;146:102701. doi: 10.1016/j.artmed.2023.102701. Epub 2023 Nov 1.
9
Approach to machine learning for extraction of real-world data variables from electronic health records.
Front Pharmacol. 2023 Sep 15;14:1180962. doi: 10.3389/fphar.2023.1180962. eCollection 2023.
10
Applications of Machine Learning in Palliative Care: A Systematic Review.
Cancers (Basel). 2023 Mar 4;15(5):1596. doi: 10.3390/cancers15051596.