Song Zhigang, Yu Chunkai, Zou Shuangmei, Wang Wenmiao, Huang Yong, Ding Xiaohui, Liu Jinhong, Shao Liwei, Yuan Jing, Gou Xiangnan, Jin Wei, Wang Zhanbo, Chen Xin, Chen Huang, Liu Cancheng, Xu Gang, Sun Zhuo, Ku Calvin, Zhang Yongqiang, Dong Xianghui, Wang Shuhao, Xu Wei, Lv Ning, Shi Huaiyin
Department of Pathology, Chinese PLA General Hospital, Beijing, China.
Department of Pathology, Capital Medical University Affiliated Beijing Shijitan Hospital, Beijing, China.
BMJ Open. 2020 Sep 10;10(9):e036423. doi: 10.1136/bmjopen-2019-036423.
The microscopic evaluation of slides has been moving steadily towards fully digital workflows in recent years, which opens the way to computer-aided diagnosis. Before deep learning models are put into practical use, it is worthwhile to understand how closely they resemble pathologists. The simple diagnostic criteria for colorectal adenoma make it a well-suited testbed for this study.
The deep learning model was trained on 177 accurately labelled slides (156 with adenoma). The detailed labelling was performed on a self-developed, iPad-based annotation system. We built the model on DeepLab v2 with ResNet-34. Model performance was evaluated on 194 test slides and compared with that of five pathologists. In addition, the generalisation ability of the model was tested on a further 168 slides (111 with adenoma) collected from two other hospitals.
The deep learning model achieved an area under the curve of 0.92 and a slide-level accuracy of over 90% on slides from two other hospitals. Its performance was on par with that of experienced pathologists and exceeded that of the average pathologist. By examining the feature maps and the cases misdiagnosed by the model, we found that its diagnostic reasoning was concordant with that of pathologists.
The deep learning model for colorectal adenoma diagnosis behaves much like pathologists: it performs on par with them, makes similar mistakes and learns rational diagnostic reasoning. It also achieves high accuracy on slides collected from different hospitals, despite significant variations in staining.
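The two slide-level metrics reported above, area under the curve and slide-level accuracy, can be illustrated with a minimal sketch. The abstract does not state how per-region predictions are aggregated into a slide-level score, so the max-probability rule, the 0.5 threshold and all function names below are assumptions made purely for illustration, and the numbers are toy values rather than study data.

```python
from typing import Sequence


def slide_score(patch_probs: Sequence[float]) -> float:
    """Assumed aggregation rule: a slide's score is its highest
    per-patch adenoma probability (not confirmed by the abstract)."""
    return max(patch_probs)


def slide_accuracy(scores: Sequence[float], labels: Sequence[int],
                   threshold: float = 0.5) -> float:
    """Fraction of slides whose thresholded score matches the label."""
    preds = [int(s >= threshold) for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive slide scores
    higher than a random negative slide (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative slide")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: four slides, each with two hypothetical patch probabilities.
patches = [[0.1, 0.9], [0.2, 0.3], [0.6, 0.4], [0.05, 0.1]]
labels = [1, 1, 0, 0]  # 1 = adenoma present, 0 = absent
scores = [slide_score(p) for p in patches]

print("slide scores:", scores)
print("accuracy:", slide_accuracy(scores, labels))
print("AUC:", roc_auc(scores, labels))
```

The pairwise AUC formulation is quadratic in the number of slides, which is acceptable for test sets of a few hundred slides such as those described here; a rank-based implementation would be preferable for much larger sets.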