Viriyasaranon Thanaporn, Chun Jung Won, Koh Young Hwan, Cho Jae Hee, Jung Min Kyu, Kim Seong-Hun, Kim Hyo Jung, Lee Woo Jin, Choi Jang-Hwan, Woo Sang Myung
Graduate Program in System Health Science and Engineering, Division of Mechanical and Biomedical Engineering, Ewha Womans University, Seoul 03760, Republic of Korea.
Center for Liver and Pancreatobiliary Cancer, National Cancer Center, Goyang 10408, Republic of Korea.
Cancers (Basel). 2023 Jun 28;15(13):3392. doi: 10.3390/cancers15133392.
The aim of this study was to develop a novel deep learning (DL) model for detecting pancreatic cancer (PC) on computed tomography (CT) images without requiring large annotated training datasets. This retrospective diagnostic study used CT images collected between 2004 and 2019 from 4287 patients diagnosed with PC. We proposed a self-supervised learning algorithm, pseudo-lesion segmentation (PS), for PC classification. DL models were trained with and without PS and validated on randomly divided training and validation sets. We further performed cross-racial external validation using open-access CT images from 361 patients. For internal validation, the convolutional neural network (CNN) model with PS achieved an accuracy of 94.3% (92.8-95.4%) and a sensitivity of 92.5% (90.0-94.4%), and the transformer-based model with PS achieved an accuracy of 95.7% (94.5-96.7%) and a sensitivity of 99.3% (98.4-99.7%). Implementing PS on a small training dataset (a randomly sampled 10%) increased accuracy by 20.5% and sensitivity by 37.0%. For external validation, the CNN model with PS achieved an accuracy of 82.5% (78.3-86.1%) and a sensitivity of 81.7% (77.3-85.4%), and the transformer-based model with PS achieved an accuracy of 87.8% (84.0-90.8%) and a sensitivity of 86.5% (82.3-89.8%). PS self-supervised learning can improve the performance, reliability, and robustness of DL-based PC classification on unseen, and even small, datasets. The proposed DL model is potentially useful for PC diagnosis.
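As an illustration of how accuracy and sensitivity figures with accompanying intervals, like those quoted in the abstract, can be derived from a confusion matrix, the following is a minimal Python sketch. It assumes a Wilson score interval at the 95% level; the abstract does not state which interval method the authors used, so this choice is an assumption. The function names and the counts in the usage example are hypothetical and are not taken from the study.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly a 95% interval. The interval method is an
    assumption for illustration; the abstract does not specify one.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def classification_metrics(tp, fp, tn, fn):
    """Return accuracy and sensitivity, each as (value, (lower, upper))."""
    total = tp + fp + tn + fn
    positives = tp + fn  # all cases that truly have the disease
    return {
        "accuracy": ((tp + tn) / total, wilson_interval(tp + tn, total)),
        "sensitivity": (tp / positives, wilson_interval(tp, positives)),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; NOT data from the study.
    metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
    for name, (value, (low, high)) in metrics.items():
        print(f"{name}: {value:.1%} ({low:.1%}-{high:.1%})")
```

Sensitivity is computed only over the truly positive cases, so its interval is typically wider than the accuracy interval when positives are a subset of the evaluation set.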