Pantanowitz Liron, Quiroga-Garza Gabriela M, Bien Lilach, Heled Ronen, Laifenfeld Daphna, Linhart Chaim, Sandbank Judith, Albrecht Shach Anat, Shalev Varda, Vecsler Manuela, Michelow Pamela, Hazelhurst Scott, Dhir Rajiv
Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, PA, USA; Department of Anatomical Pathology, University of the Witwatersrand and National Health Laboratory Services, Johannesburg, South Africa.
Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, PA, USA.
Lancet Digit Health. 2020 Aug;2(8):e407-e416. doi: 10.1016/S2589-7500(20)30159-X.
There is high demand to develop computer-assisted diagnostic tools to evaluate prostate core needle biopsies (CNBs), but little clinical validation and a lack of clinical deployment of such tools. We report here on a blinded clinical validation study and deployment of an artificial intelligence (AI)-based algorithm in a pathology laboratory for routine clinical use to aid prostate diagnosis.
An AI-based algorithm was developed using haematoxylin and eosin (H&E)-stained slides of prostate CNBs digitised with a Philips scanner, which were divided into training (1 357 480 image patches from 549 H&E-stained slides) and internal test (2501 H&E-stained slides) datasets. The algorithm provided slide-level scores for the probability of cancer, Gleason score 7-10 (vs Gleason score 6 or atypical small acinar proliferation [ASAP]), Gleason pattern 5, and perineural invasion, and calculated the percentage of cancer present in CNB material. The algorithm was subsequently validated on an external dataset of 100 consecutive cases (1627 H&E-stained slides) digitised on an Aperio AT2 scanner. In addition, the AI tool was implemented in a pathology laboratory within the routine clinical workflow as a second-read system to review all prostate CNBs. Algorithm performance was assessed with area under the receiver operating characteristic curve (AUC), specificity, and sensitivity, as well as Pearson's correlation coefficient (Pearson's r) for cancer percentage.
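For readers less familiar with these metrics, the following is a minimal sketch of how slide-level AUC, sensitivity, specificity, and Pearson's r could be computed. It is not the study's code: the function names, the 0.5 threshold, and all data values are made up for illustration, and the abstract does not state which software or operating points the authors used.

```python
from math import sqrt


def auc(labels, scores):
    """AUC as the probability that a random positive slide outscores
    a random negative slide (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative slide")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical slide-level data: 1 = cancer on pathologist review, 0 = benign.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.95, 0.80, 0.40, 0.30, 0.10, 0.55, 0.90, 0.05]

slide_auc = auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)

# Hypothetical cancer percentages per case (pathologist vs algorithm).
pathologist_pct = [10, 20, 30, 40]
algorithm_pct = [8, 18, 25, 37]
r = pearson_r(pathologist_pct, algorithm_pct)

print(f"AUC={slide_auc:.4f} sensitivity={sens:.2f} specificity={spec:.2f} r={r:.3f}")
```

The pairwise AUC shown here is quadratic in the number of slides; it is chosen for readability, and a rank-based implementation would be preferable for datasets of the size reported.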
The algorithm achieved an AUC of 0·997 (95% CI 0·995 to 0·998) for cancer detection in the internal test set and 0·991 (0·979 to 1·00) in the external validation set. The AUC for distinguishing between a low-grade (Gleason score 6 or ASAP) and high-grade (Gleason score 7-10) cancer diagnosis was 0·941 (0·905 to 0·977) and the AUC for detecting Gleason pattern 5 was 0·971 (0·943 to 0·998) in the external validation set. Cancer percentage calculated by pathologists and the algorithm showed good agreement (r=0·882, 95% CI 0·834 to 0·915; p<0·0001) with a mean bias of -4·14% (-6·36 to -1·91). The algorithm achieved an AUC of 0·957 (0·930 to 0·985) for perineural invasion. In routine practice, the algorithm was used to assess 11 429 H&E-stained slides pertaining to 941 cases leading to 90 Gleason score 7-10 alerts and 560 cancer alerts. 51 (9%) cancer alerts led to additional cuts or stains being ordered, two (4%) of which led to a third opinion request. We report on the first case of missed cancer that was detected by the algorithm.
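The mean bias and the alert percentages above can be illustrated with a short sketch. Two assumptions are made here that the abstract does not confirm: that bias is the algorithm's value minus the pathologist's value, and that the confidence interval uses a normal approximation (z = 1.96). The paired percentages are invented; only the alert counts (560, 51, and 2) come from the text.

```python
from math import sqrt
from statistics import mean, stdev


def mean_bias(reference, measured, z=1.96):
    """Mean of paired differences (measured - reference) with an
    approximate 95% CI based on the normal distribution."""
    diffs = [m - r for r, m in zip(reference, measured)]
    d = mean(diffs)
    half_width = z * stdev(diffs) / sqrt(len(diffs))
    return d, d - half_width, d + half_width


# Hypothetical cancer percentages per case.
pathologist_pct = [10, 20, 30, 40, 50]
algorithm_pct = [8, 15, 27, 35, 45]
bias, ci_low, ci_high = mean_bias(pathologist_pct, algorithm_pct)
print(f"mean bias = {bias:.2f}% ({ci_low:.2f} to {ci_high:.2f})")

# Arithmetic check of the reported alert percentages.
cancer_alerts = 560
alerts_with_extra_workup = 51
third_opinion_requests = 2
pct_extra_workup = 100 * alerts_with_extra_workup / cancer_alerts
pct_third_opinion = 100 * third_opinion_requests / alerts_with_extra_workup
print(f"{pct_extra_workup:.1f}% of cancer alerts led to additional cuts or stains")
print(f"{pct_third_opinion:.1f}% of those led to a third opinion request")
```

The second part shows that the 4% figure is taken relative to the 51 alerts that prompted additional work-up, not relative to all 560 cancer alerts.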
This study reports the successful development, external clinical validation, and deployment in clinical practice of an AI-based algorithm to accurately detect, grade, and evaluate clinically relevant findings in digitised slides of prostate CNBs.
Funding: Ibex Medical Analytics.