Yang Hao Yuan, Raghunathan Karthik, Widera Eric, Pantilat Steven Z, Brender Teva, Heintz Timothy A, Espejo Edie, Boscardin John, Mills Hunter, Lee Albert, Berchuck Jacob, Cobert Julien
Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA.
Department of Anesthesia and Perioperative Care, Duke University, Durham, NC, USA.
Sci Rep. 2025 May 18;15(1):17245. doi: 10.1038/s41598-025-01828-z.
Palliative care is known to improve quality of life in advanced cancer. Natural language processing offers insight into how documentation of palliative care in relation to metastatic cancer has changed. We analyzed inpatient clinical notes using unsupervised language models that learn how words related to metastatic cancer (e.g., "mets", "metastases") and palliative care (e.g., "palliative care", "pal care") appear in relation to one another and how those relationships change over time. We included all notes from adults hospitalized within the University of California, San Francisco system. The primary outcome was how similarly terms related to metastatic cancer and palliative care appeared in notes, quantified by cosine similarity. We used word2vec to represent words numerically as vectors and cosine similarity to capture the relationships between those vectors. We performed linear regression to identify changes in these term relationships over time. As a sensitivity analysis, we repeated the per-year analysis restricted to patients with an ICD-9/10 diagnosis code for metastatic cancer. Metastatic cancer and palliative care terms appeared in similar contexts in clinical notes each year, suggesting a close relationship in documentation. Over time, however, this relationship weakened, with these terms becoming less commonly used together as measured by cosine similarity. We found similar trends when we retrained models only on patients with a diagnosis code for metastatic cancer. Text in clinical notes offers unique insight into how medical providers document palliative care for patients with advanced malignancies and how these documentation practices evolve over time.
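The two quantitative steps named in the abstract, cosine similarity between term vectors and a linear regression of that similarity on calendar year, can be illustrated with a minimal sketch. This is not the authors' code: the two-dimensional vectors, the years, and the helper names `cosine_similarity` and `ols_slope_intercept` are hypothetical stand-ins chosen only to show the arithmetic. In the study, the vectors would come from word2vec models trained on clinical notes.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ols_slope_intercept(xs, ys):
    """Ordinary least-squares fit of ys on xs (single predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical toy data: year -> (vector for a metastatic-cancer term,
# vector for a palliative-care term). Real embeddings have many more
# dimensions; the years here are illustrative, not the study period.
toy_vectors_by_year = {
    2015: ([1.0, 0.0], [1.0, 0.2]),
    2016: ([1.0, 0.0], [1.0, 0.5]),
    2017: ([1.0, 0.0], [1.0, 1.0]),
}

years = sorted(toy_vectors_by_year)
similarities = [cosine_similarity(*toy_vectors_by_year[y]) for y in years]

# A negative slope means the two terms' vectors drift apart over time.
slope, intercept = ols_slope_intercept(years, similarities)
print(f"per-year change in cosine similarity: {slope:.4f}")
```

In this toy setup the palliative-care vector rotates away from the metastatic-cancer vector each year, so the fitted slope is negative, which is the direction of the trend the abstract reports.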