Foundation model for cancer imaging biomarkers.

Author information

Pai Suraj, Bontempi Dennis, Hadzic Ibrahim, Prudente Vasco, Sokač Mateo, Chaunzwa Tafadzwa L, Bernatz Simon, Hosny Ahmed, Mak Raymond H, Birkbak Nicolai J, Aerts Hugo J W L

Affiliations

Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Harvard Institutes of Medicine, Boston, MA USA.

Radiology and Nuclear Medicine, CARIM and GROW, Maastricht University, Maastricht, the Netherlands.

Publication information

Nat Mach Intell. 2024;6(3):354-367. doi: 10.1038/s42256-024-00807-9. Epub 2024 Mar 15.

Abstract

Foundation models in deep learning are characterized by a single large-scale model trained on vast amounts of data serving as the foundation for various downstream tasks. Foundation models are generally trained using self-supervised learning and excel in reducing the demand for training samples in downstream applications. This is especially important in medicine, where large labelled datasets are often scarce. Here, we developed a foundation model for cancer imaging biomarker discovery by training a convolutional encoder through self-supervised learning using a comprehensive dataset of 11,467 radiographic lesions. The foundation model was evaluated in distinct and clinically relevant applications of cancer imaging-based biomarkers. We found that it facilitated better and more efficient learning of imaging biomarkers and yielded task-specific models that significantly outperformed conventional supervised and other state-of-the-art pretrained implementations on downstream tasks, especially when training dataset sizes were very limited. Furthermore, the foundation model was more stable to input variations and showed strong associations with underlying biology. Our results demonstrate the tremendous potential of foundation models in discovering new imaging biomarkers that may extend to other clinical use cases and can accelerate the widespread translation of imaging biomarkers into clinical settings.
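The abstract does not state which self-supervised objective was used to train the encoder. As a minimal sketch only, assuming a contrastive setup in which two augmented views of the same lesion should map to nearby embeddings, the widely used NT-Xent loss can be written as follows. The function name `nt_xent_loss`, the temperature value and the embedding shapes are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def nt_xent_loss(z_a, z_b, temperature=0.1):
    """NT-Xent contrastive loss for two batches of paired embeddings.

    z_a[i] and z_b[i] are assumed to be embeddings of two views of the
    same sample; every other embedding in the batch acts as a negative.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)             # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-length rows
    sim = (z @ z.T) / temperature                      # scaled cosine similarity
    np.fill_diagonal(sim, -np.inf)                     # a sample is not its own negative

    # Row i's positive sits at i + N for the first half, i - N for the second.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), pos].mean())
```

Minimising this loss pulls paired views together and pushes unrelated samples apart. That is one way an encoder can learn from unlabelled lesions before being reused on downstream tasks with few labels.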

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b28/10957482/93f2454a7760/42256_2024_807_Fig1_HTML.jpg
