Predicting health-related quality of life change using natural language processing in thyroid cancer.

Authors

Lian Ruixue, Hsiao Vivian, Hwang Juwon, Ou Yue, Robbins Sarah E, Connor Nadine P, Macdonald Cameron L, Sippel Rebecca S, Sethares William A, Schneider David F

Affiliations

University of Wisconsin, Madison, USA.

University of Wisconsin, Madison Department of Electrical and Computer Engineering, USA.

Publication information

Intell Based Med. 2023;7. doi: 10.1016/j.ibmed.2023.100097. Epub 2023 Mar 15.

DOI:10.1016/j.ibmed.2023.100097
PMID:37664403
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10473865/
Abstract

BACKGROUND

Patient-reported outcomes (PRO) allow clinicians to measure health-related quality of life (HRQOL) and understand patients' treatment priorities, but obtaining PRO requires surveys which are not part of routine care. We aimed to develop a preliminary natural language processing (NLP) pipeline to extract HRQOL trajectory based on deep learning models using patient language.

MATERIALS AND METHODS

Our data consisted of transcribed interviews of 100 patients undergoing surgical intervention for low-risk thyroid cancer, paired with HRQOL assessments completed during the same visits. Our outcome measure was HRQOL trajectory measured by the SF-12 physical and mental component scores (PCS and MCS), and average THYCA-QoL score. We constructed an NLP pipeline based on BERT, a modern deep language model that captures context semantics, to predict HRQOL trajectory as measured by the above endpoints. We compared this to baseline models using logistic regression and support vector machines trained on bag-of-words representations of transcripts obtained using Linguistic Inquiry and Word Count (LIWC). Finally, given the modest dataset size, we implemented two data augmentation methods to improve performance: first by generating synthetic samples via GPT-2, and second by changing the representation of available data via sequence-by-sequence pairing, which is a novel approach.
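
To make the baseline comparison concrete, the following is a minimal sketch of a logistic regression classifier trained on bag-of-words counts. It is an illustration under stated assumptions, not the authors' pipeline: the transcripts and labels are invented toy data, plain word counts stand in for LIWC category counts, and the names `bag_of_words`, `train_logreg`, and `predict_proba` are hypothetical helpers.

```python
import math
from collections import Counter
from typing import List, Tuple


def bag_of_words(text: str, vocab: List[str]) -> List[float]:
    """Count how often each vocabulary word occurs in a transcript."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(x: List[float], weights: List[float], bias: float) -> float:
    """Predicted probability of the positive class for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logreg(
    X: List[List[float]], y: List[int], lr: float = 0.1, epochs: int = 500
) -> Tuple[List[float], float]:
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            error = predict_proba(x, weights, bias) - target
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Invented toy transcripts (assumption): 1 = HRQOL improved, 0 = HRQOL declined.
texts = [
    "i feel stronger and sleep well",
    "i feel tired and anxious",
    "energy is back and i feel well",
    "still tired and worried about the scar",
]
labels = [1, 0, 1, 0]

vocab = sorted({word for text in texts for word in text.lower().split()})
X = [bag_of_words(text, vocab) for text in texts]
weights, bias = train_logreg(X, labels)
predictions = [int(predict_proba(x, weights, bias) > 0.5) for x in X]
```

A bag-of-words representation discards word order and context, which is the limitation a contextual model such as BERT is meant to address.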

RESULTS

A BERT-based deep learning model with GPT-2 synthetic sample augmentation achieved an area under the curve (AUC) of 76.3% in classifying HRQOL trajectory as measured by PCS, compared with an AUC of 59.9% for the baseline logistic regression and bag-of-words model. The sequence-by-sequence pairing method for augmentation had an AUC of 71.2% when used with the BERT model.
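
For readers less familiar with the metric, the AUC figures above can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes that quantity directly by pairwise comparison. The labels and scores are invented toy values, not data from the study, and `roc_auc` is a hypothetical helper name.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy example (assumption): 3 positives, 3 negatives.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
auc = roc_auc(toy_labels, toy_scores)  # 7 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so the reported 59.9% baseline sits only modestly above chance.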

CONCLUSIONS

NLP methods show promise in extracting PRO from unstructured narrative data, and in the future may aid in assessing and forecasting patients' HRQOL in response to medical treatments. Our experiments with optimization methods suggest larger amounts of novel data would further improve performance of the classification model.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d9e6/10473865/f4379aa84a51/nihms-1909853-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d9e6/10473865/69d2b11a2b80/nihms-1909853-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d9e6/10473865/36400018c5e3/nihms-1909853-f0002.jpg