Inferring cancer disease response from radiology reports using large language models with data augmentation and prompting.

Author affiliations

Division of Medical Oncology, National Cancer Centre Singapore, Singapore.

Duke-NUS Medical School, Singapore.

Publication information

J Am Med Inform Assoc. 2023 Sep 25;30(10):1657-1664. doi: 10.1093/jamia/ocad133.

Abstract

OBJECTIVE

To assess large language models on their ability to accurately infer cancer disease response from free-text radiology reports.

MATERIALS AND METHODS

We assembled 10 602 computed tomography reports from cancer patients seen at a single institution. All reports were classified into: no evidence of disease, partial response, stable disease, or progressive disease. We applied transformer models, a bidirectional long short-term memory model, a convolutional neural network model, and conventional machine learning methods to this task. Data augmentation using sentence permutation with consistency loss as well as prompt-based fine-tuning were used on the best-performing models. Models were validated on a hold-out test set and an external validation set based on Response Evaluation Criteria in Solid Tumors (RECIST) classifications.
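The sentence-permutation augmentation with consistency loss described above can be illustrated with a minimal sketch. The abstract does not specify the sentence splitter or the exact form of the loss, so the regular-expression splitter, the symmetric KL divergence, and all function names below are assumptions made for illustration, not the authors' implementation.

```python
import math
import random
import re


def permute_sentences(report: str, rng: random.Random) -> str:
    """Return the report with its sentences in a shuffled order."""
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", report.strip()) if s]
    rng.shuffle(sentences)
    return " ".join(sentences)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(p_original, p_augmented):
    """Symmetric KL between predictions on the original and permuted report.

    Assumption: the paper's consistency loss may take a different form.
    """
    return 0.5 * (kl_divergence(p_original, p_augmented)
                  + kl_divergence(p_augmented, p_original))


report = "Liver lesion has decreased in size. No new lesions. Lungs are clear."
augmented = permute_sentences(report, random.Random(0))

# Hypothetical class probabilities a model might output for the two versions,
# ordered as: no evidence of disease, partial response, stable, progressive.
p_orig = [0.05, 0.70, 0.20, 0.05]
p_aug = [0.05, 0.60, 0.30, 0.05]
loss = consistency_loss(p_orig, p_aug)
```

In training, a term like `loss` would typically be added to the usual classification loss, so that the model is penalized when reordering sentences changes its prediction.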

RESULTS

The best-performing model was the GatorTron transformer, which achieved an accuracy of 0.8916 on the test set and 0.8919 on the RECIST validation set. Data augmentation further improved the accuracy to 0.8976. Prompt-based fine-tuning did not further improve accuracy, but it reduced the number of training reports needed to 500 while still achieving good performance.
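Prompt-based fine-tuning generally recasts classification as filling in a masked word, which is one reason it can work with fewer labelled examples. The sketch below shows the idea only: the template wording, the label words, and the `mask_scores` input are hypothetical, since the abstract does not describe the prompt that was actually used.

```python
# Hypothetical verbalizer: one label word per response category.
LABEL_WORDS = {
    "no evidence of disease": "absent",
    "partial response": "improved",
    "stable disease": "stable",
    "progressive disease": "worse",
}


def build_prompt(report: str, mask_token: str = "[MASK]") -> str:
    """Append a cloze-style question to the report (template is an assumption)."""
    return f"{report.strip()} Overall, the disease is {mask_token}."


def predict_label(mask_scores: dict) -> str:
    """Pick the category whose label word scores highest at the masked slot.

    `mask_scores` stands in for the scores a masked language model would
    assign to candidate words; words missing from it are treated as -inf.
    """
    return max(
        LABEL_WORDS,
        key=lambda label: mask_scores.get(LABEL_WORDS[label], float("-inf")),
    )


prompt = build_prompt("Liver lesion has decreased in size. No new lesions.")
predicted = predict_label({"improved": 2.0, "stable": 1.0, "worse": -0.5})
```

Here `predicted` is the category mapped to the highest-scoring label word; during fine-tuning, the model would be trained so that the correct label word receives the highest score at the masked position.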

DISCUSSION

These models could be used by researchers to derive progression-free survival in large datasets. They may also serve as a decision support tool by providing clinicians with an automated second opinion on disease response.
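As a rough illustration of how per-report classifications might feed a progression-free survival estimate, the sketch below takes the time from treatment start to the first report labelled progressive disease. It is a simplification under stated assumptions: the function name and data layout are hypothetical, and death, which also counts as an event in progression-free survival, is ignored.

```python
from datetime import date


def progression_free_days(treatment_start, classified_reports):
    """Return (days, progressed) from a list of (report_date, label) pairs.

    If no report is labelled progressive disease, the patient is treated as
    censored at the last available report (progressed is False).
    """
    # Keep only reports on or after treatment start, in date order.
    reports = sorted(r for r in classified_reports if r[0] >= treatment_start)
    for report_date, label in reports:
        if label == "progressive disease":
            return (report_date - treatment_start).days, True
    if not reports:
        return 0, False
    return (reports[-1][0] - treatment_start).days, False


start = date(2021, 1, 1)
history = [
    (date(2021, 3, 1), "stable disease"),
    (date(2021, 6, 1), "partial response"),
    (date(2021, 9, 1), "progressive disease"),
]
days, progressed = progression_free_days(start, history)  # (243, True)
```

Applied across a cohort, the resulting (duration, event) pairs are the usual input to a survival analysis such as a Kaplan-Meier estimate.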

CONCLUSIONS

Large clinical language models demonstrate potential to infer cancer disease response from radiology reports at scale. Data augmentation techniques are useful to further improve performance. Prompt-based fine-tuning can significantly reduce the size of the training dataset.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05d2/10531105/b8211491a849/ocad133f1.jpg