Uncertainty-aware automatic TNM staging classification for [18F] Fluorodeoxyglucose PET-CT reports for lung cancer utilising transformer-based language models and multi-task learning.

Author Information

Barlow Stephen H, Chicklore Sugama, He Yulan, Ourselin Sebastien, Wagner Thomas, Barnes Anna, Cook Gary J R

Affiliations

School of Biomedical Engineering and Imaging Sciences, King's College London, London, UK.

King's College London and Guy's and St. Thomas' PET Centre, St. Thomas' Hospital, London, UK.

Publication Information

BMC Med Inform Decis Mak. 2024 Dec 18;24(1):396. doi: 10.1186/s12911-024-02814-7.

Abstract

BACKGROUND

[18F] Fluorodeoxyglucose (FDG) PET-CT is a clinical imaging modality widely used in diagnosing and staging lung cancer. The clinical findings of PET-CT studies are contained within free text reports, which can currently only be categorised by experts manually reading them. Pre-trained transformer-based language models (PLMs) have shown success in extracting complex linguistic features from text. Accordingly, we developed a multi-task 'TNMu' classifier to classify the presence/absence of tumour, node, and metastasis ('TNM') findings (as defined by the Eighth Edition of TNM Staging for Lung Cancer). This is combined with an uncertainty classification task ('u') to account for studies with ambiguous TNM status.
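
The abstract does not include code, so the sketch below is only a rough illustration of the architecture it describes: a shared pre-trained encoder with four independent binary heads, one each for T, N, and M presence/absence and one for the uncertainty flag. The class name, head layout, use of the Hugging Face transformers API, and the GatorTron checkpoint identifier are all assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of a multi-task "TNMu" classifier: a shared PLM
# encoder feeding four independent binary heads (T, N, M, uncertainty).
# Checkpoint name and architecture details are assumptions, not the
# authors' code.
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class TNMuClassifier(nn.Module):
    def __init__(self, plm_name="UFNLP/gatortron-base", n_tasks=4):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(plm_name)
        hidden = self.encoder.config.hidden_size
        # One 2-class head per task: T, N, M, and uncertainty ("u").
        self.heads = nn.ModuleList([nn.Linear(hidden, 2) for _ in range(n_tasks)])

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]          # [CLS]-token representation
        return [head(cls) for head in self.heads]  # one pair of logits per task

# Minimal usage: tokenise a report and obtain per-task logits.
tokenizer = AutoTokenizer.from_pretrained("UFNLP/gatortron-base")
model = TNMuClassifier()
batch = tokenizer(["Intensely FDG-avid right upper lobe mass ..."],
                  return_tensors="pt", truncation=True, padding=True)
logits_per_task = model(batch["input_ids"], batch["attention_mask"])
```

In a multi-task setup of this kind the training loss is typically the sum of the per-task cross-entropy losses, so a single fine-tuned encoder serves all four tasks; this is consistent with, but not taken from, the paper's description.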

METHODS

2498 reports were annotated by a nuclear medicine physician and split into train, validation, and test datasets. For additional evaluation, an external dataset (n = 461 reports) was created and annotated by two nuclear medicine physicians, with agreement reached on all examples. We trained and evaluated eleven publicly available PLMs to determine which is most effective for PET-CT reports, and compared multi-task, single-task, and traditional machine learning approaches.
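
The abstract does not say which traditional machine-learning baseline was compared against the PLM approaches. Purely for illustration, a common report-level baseline is TF-IDF features with an independent linear classifier per task, sketched below with scikit-learn; the feature settings and classifier choice are assumptions.

```python
# Illustrative single-task baseline only; the paper's actual
# "traditional machine learning" setup is not specified in the abstract.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_single_task_baseline(train_texts, train_labels):
    """Fit an independent TF-IDF + logistic-regression model for one task."""
    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=2),
                        LogisticRegression(max_iter=1000))
    clf.fit(train_texts, train_labels)
    return clf

# One such model would be trained per label (T, N, M, uncertainty),
# in contrast to the single shared encoder of the multi-task PLM.
```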

RESULTS

We find that a multi-task approach with GatorTron as the PLM achieves the best performance, with an overall accuracy (all four tasks correct) of 84% and a Hamming loss of 0.05 on the internal test dataset, and 79% and 0.07 on the external test dataset. Performance on the individual T, N, and M tasks approached expert performance, with macro average F1 scores of 0.91, 0.95, and 0.90 respectively on external data. For the uncertainty task, an F1 of 0.77 was achieved.
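
For concreteness, the reported metrics can be computed as in the sketch below, assuming the four task labels for each report are collected into an (n_reports × 4) binary matrix ordered T, N, M, u; this is not the authors' evaluation code.

```python
# Minimal metric sketch: y_true and y_pred are (n_reports, 4) binary
# numpy arrays ordered T, N, M, u. Assumed layout, not the authors' code.
import numpy as np
from sklearn.metrics import f1_score, hamming_loss

def evaluate(y_true, y_pred, tasks=("T", "N", "M", "u")):
    # "Overall accuracy": fraction of reports with all four labels correct.
    exact_match = np.mean(np.all(y_true == y_pred, axis=1))
    # Hamming loss: fraction of individual labels that are wrong.
    h_loss = hamming_loss(y_true, y_pred)
    # Per-task macro-averaged F1 (mean of the F1 for each of the two classes).
    per_task_f1 = {t: f1_score(y_true[:, i], y_pred[:, i], average="macro")
                   for i, t in enumerate(tasks)}
    return exact_match, h_loss, per_task_f1
```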

CONCLUSIONS

Our 'TNMu' classifier successfully extracts TNM staging information from internal and external PET-CT reports. We conclude that multi-task approaches give the best performance and better computational efficiency than single-task PLM approaches. We believe these models can improve PET-CT services by assisting in auditing, creating research cohorts, and developing decision support systems. Our approach to handling uncertainty represents a novel first step but has room for further refinement.
