Department of Pathology, Radboud University Medical Center, Nijmegen, Netherlands.
Department of Pathology, Canisius Wilhelmina Hospital, Nijmegen, Netherlands.
Lab Invest. 2019 Nov;99(11):1596-1606. doi: 10.1038/s41374-019-0275-0. Epub 2019 Jun 20.
As part of routine histological grading, the mitotic count of every invasive breast cancer is assessed by counting mitoses in the visually selected region with the highest proliferative activity. Because this procedure is prone to subjectivity, the present study compares visual mitotic counting with deep-learning-based automated mitotic counting and fully automated hotspot selection. Two cohorts were used. Cohort A comprised 90 prospectively included tumors, selected on the basis of the mitotic frequency scores given by a pathologist during routine glass slide diagnostics. This pathologist additionally assessed the mitotic count of these tumors in whole slide images (WSI) within a preselected hotspot. A second observer performed the same procedures on this cohort. The preselected hotspot was generated by a convolutional neural network (CNN) trained to detect all mitotic figures in digitized hematoxylin and eosin (H&E) sections. The second cohort was a multicenter, retrospective triple-negative breast cancer (TNBC) cohort (n = 298), for which the mitotic count was assessed by three independent observers on glass slides. The same CNN was applied to this cohort, and the absolute number of mitotic figures in the hotspot was compared with the averaged mitotic count of the observers. Baseline interobserver agreement for glass slide assessment in cohort A was good (kappa 0.689; 95% CI 0.580-0.799). With the CNN-generated hotspot in WSI, kappa increased to 0.814 (95% CI 0.719-0.909). Comparing automated counting by the CNN with observers counting in the predefined hotspot region yielded an average kappa of 0.724. We conclude that manual mitotic counting is not affected by assessment modality (glass slides, WSI) and that counting mitotic figures in WSI is feasible. Using a predefined hotspot area considerably improves reproducibility. In addition, fully automated assessment of the mitotic score appears to be feasible without introducing additional bias or variability.
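The agreement figures above are kappa statistics, which correct the raw proportion of matching scores for the agreement expected by chance. The following is a minimal sketch of unweighted Cohen's kappa for two observers. The abstract does not state which kappa variant (unweighted or weighted) the authors used, so the unweighted form is an assumption here. The function name `cohen_kappa` and the example scores are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    # Observed agreement: fraction of cases where both raters gave the same score.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement, derived from each rater's marginal score frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        # Both raters used one identical category for every case.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up mitotic scores (1 to 3) for ten tumors; NOT data from the study.
observer_1 = [1, 1, 2, 3, 3, 2, 1, 3, 2, 1]
observer_2 = [1, 2, 2, 3, 3, 2, 1, 3, 3, 1]
kappa = cohen_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.3f}")
```

In this illustrative example the observers agree on 8 of 10 tumors (0.80), while the marginal frequencies give a chance agreement of 0.33, so kappa is (0.80 - 0.33) / (1 - 0.33). For ordinal scores such as these, a weighted kappa that penalizes a 1-versus-3 disagreement more than a 1-versus-2 disagreement is a common alternative.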