Robotics and Tech. of Computers Lab., Universidad de Sevilla, 41012 Seville, Spain; Escuela Técnica Superior de Ingeniería Informática (ETSII), Avenida de Reina Mercedes s/n, Universidad de Sevilla, 41012 Seville, Spain; Escuela Politécnica Superior (EPS), Universidad de Sevilla, 41011 Seville, Spain.
Robotics and Tech. of Computers Lab., Universidad de Sevilla, 41012 Seville, Spain; Escuela Técnica Superior de Ingeniería Informática (ETSII), Avenida de Reina Mercedes s/n, Universidad de Sevilla, 41012 Seville, Spain; Escuela Politécnica Superior (EPS), Universidad de Sevilla, 41011 Seville, Spain; Smart Computer Systems Research and Engineering Lab (SCORE), Research Institute of Computer Engineering (I3US), Universidad de Sevilla, 41012 Seville, Spain.
Comput Biol Med. 2023 Jun;159:106856. doi: 10.1016/j.compbiomed.2023.106856. Epub 2023 Apr 6.
Prostate cancer is one of the most commonly diagnosed cancers in men. Advances in modern medicine have considerably reduced its mortality, yet it remains a leading cause of cancer death. Prostate cancer is diagnosed mainly by biopsy. Whole-slide images are obtained from the biopsy, and pathologists use them to grade the cancer according to the Gleason scale. On this scale, which ranges from 1 to 5, grade 3 and above is considered malignant tissue. Several studies have shown inter-observer discrepancies between pathologists when assigning Gleason grades. Given recent advances in artificial intelligence, there is great interest in applying it to computational pathology to support the professional and provide a second opinion.
In this work, the inter-observer variability of a local dataset of 80 whole-slide images annotated by a team of 5 pathologists from the same group was analyzed at both area and label level. Four approaches were followed to train six different Convolutional Neural Network architectures, which were evaluated on the same dataset on which the inter-observer variability was analyzed.
An inter-observer agreement of κ = 0.6946 was obtained, with a 46% discrepancy in the area size of the annotations made by the pathologists. The best trained models achieved κ = 0.826 ± 0.014 on the test set when trained with data from the same source.
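To make the κ figures concrete, the following is a minimal sketch of how an agreement coefficient of this kind can be computed. It assumes unweighted Cohen's κ between two raters; the abstract does not state which κ variant was used or how the five pathologists' annotations were combined. The function name `cohen_kappa` and the example labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length label sequences.

    Assumption: the study may have used a different variant (for example a
    weighted or multi-rater kappa); this is only an illustration.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected chance agreement, from each rater's marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    labels = count_a.keys() | count_b.keys()
    p_e = sum(count_a[label] * count_b[label] for label in labels) / (n * n)

    if p_e == 1.0:
        # Both raters used a single identical label; treat as perfect agreement.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical Gleason grades assigned by two raters to ten regions.
grades_a = [3, 3, 4, 4, 5, 3, 4, 5, 3, 4]
grades_b = [3, 3, 4, 5, 5, 3, 4, 4, 3, 4]
print(f"kappa = {cohen_kappa(grades_a, grades_b):.4f}")  # kappa = 0.6875
```

In this made-up example the raters agree on 8 of 10 regions (observed agreement 0.80) and the chance agreement is 0.36, giving κ = (0.80 − 0.36) / (1 − 0.36) = 0.6875. A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance.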
The results show that deep learning-based automatic diagnosis systems could help reduce the widely known inter-observer variability among pathologists and support them in their decisions, serving as a second opinion or as a triage tool for medical centers.