Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA.
Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
Nat Commun. 2022 Nov 2;13(1):6572. doi: 10.1038/s41467-022-34025-x.
A model's ability to express its own predictive uncertainty is an essential attribute for maintaining clinical user confidence as computational biomarkers are deployed into real-world medical settings. In the domain of cancer digital histopathology, we describe a clinically-oriented approach to uncertainty quantification for whole-slide images, estimating uncertainty using dropout and calculating thresholds on training data to establish cutoffs for low- and high-confidence predictions. We train models to identify lung adenocarcinoma vs. squamous cell carcinoma and show that high-confidence predictions outperform predictions without uncertainty, in both cross-validation and testing on two large external datasets spanning multiple institutions. Our testing strategy closely approximates real-world application, with predictions generated on unsupervised, unannotated slides using predetermined thresholds. Furthermore, we show that uncertainty thresholding remains reliable in the setting of domain shift, with accurate high-confidence predictions of adenocarcinoma vs. squamous cell carcinoma for out-of-distribution, non-lung cancer cohorts.
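The abstract names two steps: estimating uncertainty with dropout, and deriving a cutoff from training data that is then applied unchanged to new slides. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that uncertainty is the standard deviation of predictions across repeated dropout-enabled forward passes, and that the cutoff is chosen by maximizing Youden's J for separating correct from incorrect training predictions. Both choices, along with all function names, are illustrative assumptions that the abstract does not specify.

```python
import numpy as np


def mc_dropout_summary(passes):
    """Summarize T stochastic forward passes (dropout left on) for N slides.

    passes: array of shape (T, N) holding predicted probabilities.
    Returns (mean probability, standard deviation) per slide.
    Assumption: standard deviation is used as the uncertainty measure.
    """
    passes = np.asarray(passes, dtype=float)
    return passes.mean(axis=0), passes.std(axis=0)


def pick_uncertainty_threshold(uncertainty, correct):
    """Pick a cutoff on training data (assumed criterion: Youden's J).

    Predictions with uncertainty <= cutoff count as high-confidence.
    """
    uncertainty = np.asarray(uncertainty, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    best_t, best_j = None, -np.inf
    for t in np.unique(uncertainty):
        confident = uncertainty <= t
        # Fraction of correct predictions retained as high-confidence.
        sens = (confident & correct).sum() / max(correct.sum(), 1)
        # Fraction of incorrect predictions flagged as low-confidence.
        spec = (~confident & ~correct).sum() / max((~correct).sum(), 1)
        j = sens + spec - 1
        if j > best_j:
            best_t, best_j = t, j
    return float(best_t)


def split_by_confidence(mean_prob, uncertainty, threshold):
    """Apply a predetermined cutoff; returns (predicted labels, high-confidence mask)."""
    labels = (np.asarray(mean_prob) >= 0.5).astype(int)
    high = np.asarray(uncertainty) <= threshold
    return labels, high


# Toy training data: 3 dropout passes over 4 slides (made-up numbers).
train_passes = np.array([
    [0.90, 0.10, 0.60, 0.40],
    [0.92, 0.08, 0.30, 0.70],
    [0.88, 0.12, 0.50, 0.60],
])
train_labels = np.array([1, 0, 1, 0])

mean_prob, unc = mc_dropout_summary(train_passes)
correct = (mean_prob >= 0.5).astype(int) == train_labels
threshold = pick_uncertainty_threshold(unc, correct)
pred, high_conf = split_by_confidence(mean_prob, unc, threshold)
```

In this toy example the two slides with stable predictions across passes fall under the cutoff, while the two unstable (and misclassified) slides are flagged as low-confidence. In a deployment setting the cutoff would be fixed after training and reused on external slides without recalibration.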