Semi-supervised learning in cancer diagnostics.

Author information

Eckardt Jan-Niklas, Bornhäuser Martin, Wendt Karsten, Middeke Jan Moritz

Affiliations

Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany.

Else Kröner Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany.

Publication information

Front Oncol. 2022 Jul 14;12:960984. doi: 10.3389/fonc.2022.960984. eCollection 2022.

Abstract

In cancer diagnostics, a considerable amount of data is acquired during routine work-up. Recently, machine learning has been used to build classifiers that detect cancer and aid clinical decision-making. Most of these classifiers are based on supervised learning (SL), which requires time- and cost-intensive manual labeling of samples by medical experts for model training. Semi-supervised learning (SSL), however, works with only a fraction of labeled data by also drawing information from unlabeled samples, and can thus exploit the large gap between the labeled data and the total data available in cancer diagnostics. In this review, we provide a comprehensive overview of the essential functionalities and assumptions of SSL and survey key studies in cancer care, differentiating between image-based and non-image-based applications. We highlight current state-of-the-art models in histopathology, radiology and radiotherapy, as well as genomics. Further, we discuss potential pitfalls in SSL study design, such as discrepancies in data distributions and comparison to baseline SL models, and point out future directions for SSL in oncology. We believe that well-designed SSL models will contribute strongly to computer-guided diagnostics in malignant disease by overcoming current hindrances in the form of sparse labeled and abundant unlabeled data.
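To make the idea of learning from a few labeled and many unlabeled samples concrete, here is a minimal sketch of self-training (pseudo-labeling), one common SSL strategy. It is an illustration only and is not taken from the review: the one-dimensional nearest-centroid classifier, the function names (`fit_centroids`, `predict`, `self_train`), the `min_margin` threshold, and the toy values are all assumptions chosen for brevity.

```python
from statistics import mean


def fit_centroids(xs, ys):
    """Fit a 1-D nearest-centroid model: the mean feature value per class."""
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in set(ys)}


def predict(centroids, x):
    """Return (label, margin); margin is how much closer the best class is than the runner-up."""
    ranked = sorted(centroids, key=lambda c: abs(x - centroids[c]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin


def self_train(xs_lab, ys_lab, xs_unlab, min_margin=1.0, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled samples and refit the model."""
    xs, ys = list(xs_lab), list(ys_lab)
    pool = list(xs_unlab)
    for _ in range(max_rounds):
        centroids = fit_centroids(xs, ys)
        scored = [(x, predict(centroids, x)) for x in pool]
        confident = [(x, label) for x, (label, margin) in scored if margin >= min_margin]
        if not confident:
            break  # nothing left that the model is confident about
        for x, label in confident:
            xs.append(x)
            ys.append(label)
            pool.remove(x)  # ambiguous samples stay in the pool, unlabeled
    return fit_centroids(xs, ys)


# Hypothetical toy data: two expert-labeled samples, seven unlabeled ones.
model = self_train(
    [0.0, 10.0], ["benign", "malignant"],
    [1.0, 2.0, 3.0, 7.0, 8.0, 9.0, 5.2],
)
print(model)
```

In this sketch the ambiguous sample (5.2) never receives a pseudo-label, because its margin stays below `min_margin`. A fair evaluation would also fit the same classifier on the two labeled samples alone, in line with the abstract's point about comparing against a baseline SL model.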

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7284/9329803/2b3da64cae5e/fonc-12-960984-g001.jpg
