Zhu Libing, Rong Yi, McGee Lisa A, Rwigema Jean-Claude M, Patel Samir H
Department of Radiation Oncology, Mayo Clinic, Phoenix, AZ 85054, USA.
Cancers (Basel). 2024 Jun 24;16(13):2311. doi: 10.3390/cancers16132311.
This study aimed to develop a retrained large language model (LLM) tailored to the needs of head and neck (HN) cancer patients treated with radiotherapy, with emphasis on symptom management and survivorship care.
A comprehensive external database was curated for training ChatGPT-4. It integrated expert-identified consensus guidelines on supportive care for HN patients with correspondence from physicians and nurses in our institution's electronic medical records for 90 HN patients. Model performance was evaluated on 20 post-treatment patient inquiries, with the responses assessed by three board-certified radiation oncologists (RadOncs). Each response was rated on a scale of 1 (strongly disagree) to 5 (strongly agree) for accuracy, clarity, completeness, and relevance.
The average scores for the 20 tested questions were 4.25 for accuracy, 4.35 for clarity, 4.22 for completeness, and 4.32 for relevance, on a 5-point scale. Overall, 91.67% (220 out of 240) of assessments received scores of 3 or higher, and 83.33% (200 out of 240) received scores of 4 or higher.
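The reported percentages can be checked with a few lines of arithmetic. The sketch below assumes the 240 assessments arise from 20 questions, each rated by 3 RadOncs on 4 criteria; the abstract does not state this breakdown explicitly, and the variable and function names are illustrative only.

```python
# Assumed breakdown (not stated explicitly in the abstract):
# 20 questions x 3 raters x 4 criteria = 240 individual assessments.
questions, raters, criteria = 20, 3, 4
total = questions * raters * criteria


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


print(total)              # number of assessments under the assumed breakdown
print(share(220, total))  # percentage of assessments scoring 3 or higher
print(share(200, total))  # percentage of assessments scoring 4 or higher
```

Under this assumption the total is 240, and the two shares match the reported 91.67% and 83.33%.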
The custom-trained model demonstrates high accuracy in supporting HN patients, offering evidence-based information and guidance on symptom management and survivorship care.