From the Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, United Kingdom (A.G.R., X.L., I.L., T.D.B., N.B., G.J.W., N.S., A.L., A.H., E.O.A.); Imaging Department, Imperial College Healthcare NHS Trust, London, United Kingdom (A.G.R., T.D.B., N.B., A.S., J.B., A.F., N.S., K.W., A.L., J.R., M.S.); Imperial Clinical Trials Unit, Imperial College London, London, United Kingdom (N.J., S.S., A.T.P., J.W., X.L.); Centre for Medical Imaging, University College London, London, United Kingdom (S.P., H.S., A.P., S.T.); Department of Radiology, University College London Hospital, London, United Kingdom (S.P., H.S., A.P., S.T.); Cancer Imaging, School of Biomedical Engineering & Imaging Sciences, King's College London and Department of Radiology, Guy's & St Thomas' Hospitals NHS Foundation Trust, London, United Kingdom (V.G., C.K.-M.); Department of Biomedical Imaging and Image-Guided Therapy, Medical University of Vienna, Vienna General Hospital, Vienna, Austria (G.J.W.); Royal Marsden NHS Foundation Trust and The Institute of Cancer Research, London, United Kingdom (D.-M.K., C.M., N.T., N.S., C.K.-M., K.N.D.P.); Cancer Research UK and University College London Cancer Trials Unit, London, United Kingdom (K.R.); Faculty of Engineering, Department of Computing, Imperial College London, London, United Kingdom (Q.D., B.G.); King's Cancer Prevention Group, School of Cancer and Pharmaceutical Sciences, King's College London, London, United Kingdom (J.W.); Department of Radiology, Homerton NHS Foundation Trust, London, United Kingdom (P.B.); Paul Strickland Scanner Centre, Mount Vernon Hospital (H.S.); Department of Radiology, The Hillingdon Hospitals NHS Foundation Trust, London, United Kingdom (H.S.); Thirlestaine Breast Centre, Gloucestershire Hospitals NHS Foundation Trust, London, United Kingdom (S.V.); and Nightingale-Saunders Clinical Trials & Epidemiology Unit, King's Clinical Trials Unit, London, United Kingdom (A.T.P.).
Invest Radiol. 2023 Dec 1;58(12):823-831. doi: 10.1097/RLI.0000000000000996. Epub 2023 Jun 26.
Whole-body magnetic resonance imaging (WB-MRI) has been demonstrated to be efficient and cost-effective for cancer staging. The study aim was to develop a machine learning (ML) algorithm to improve radiologists' sensitivity and specificity for metastasis detection and reduce reading times.
A retrospective analysis of 438 prospectively collected WB-MRI scans from the multicenter Streamline studies (February 2013-September 2016) was undertaken. Disease sites were manually labeled using the Streamline reference standard. Whole-body MRI scans were randomly allocated to training and testing sets. A model for malignant lesion detection was developed based on convolutional neural networks and a 2-stage training strategy. The final algorithm generated lesion probability heat maps. Using a concurrent reader paradigm, 25 radiologists (18 experienced, 7 inexperienced in WB-MRI) were randomly allocated WB-MRI scans with or without ML support to detect malignant lesions over 2 or 3 reading rounds. Reads were undertaken in a diagnostic radiology reading room between November 2019 and March 2020. Reading times were recorded by a scribe. The prespecified analysis included sensitivity, specificity, interobserver agreement, and reading time of radiology readers to detect metastases with or without ML support. Reader performance for detection of the primary tumor was also evaluated.
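As an illustration of the prespecified per-patient metrics, the following is a minimal Python sketch of how sensitivity, specificity, and Cohen κ can be computed from binary reads. The function names and the toy labels are hypothetical and are not taken from the study; the study's own analysis may have differed (for example, by accounting for repeated reads per reader).

```python
from typing import Sequence, Tuple


def sensitivity_specificity(truth: Sequence[int], calls: Sequence[int]) -> Tuple[float, float]:
    """Per-patient sensitivity and specificity (1 = metastasis present, 0 = absent)."""
    tp = sum(1 for t, c in zip(truth, calls) if t == 1 and c == 1)
    fn = sum(1 for t, c in zip(truth, calls) if t == 1 and c == 0)
    tn = sum(1 for t, c in zip(truth, calls) if t == 0 and c == 0)
    fp = sum(1 for t, c in zip(truth, calls) if t == 0 and c == 1)
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(reader_a: Sequence[int], reader_b: Sequence[int]) -> float:
    """Cohen kappa for two readers making binary calls on the same patients."""
    n = len(reader_a)
    observed = sum(1 for a, b in zip(reader_a, reader_b) if a == b) / n
    p_a = sum(reader_a) / n  # proportion of positive calls by reader A
    p_b = sum(reader_b) / n  # proportion of positive calls by reader B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # chance agreement
    return (observed - expected) / (1 - expected)


# Hypothetical toy data (NOT study data): 10 patients, 4 with metastases.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
reader_a = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
reader_b = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(truth, reader_a)
kappa = cohen_kappa(reader_a, reader_b)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} kappa={kappa:.3f}")
```

In this sketch, sensitivity and specificity compare one reader against the reference standard, whereas κ compares two readers with each other and corrects their raw agreement for the agreement expected by chance.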
Four hundred thirty-three evaluable WB-MRI scans were allocated to algorithm training (245) or radiology testing (188, including 50 patients with metastases; primary colon [n = 117] or lung [n = 71] cancer). Among a total of 562 reads by experienced radiologists over 2 reading rounds, per-patient specificity was 86.2% (ML) and 87.7% (non-ML) (-1.5% difference; 95% confidence interval [CI], -6.4%, 3.5%; P = 0.39). Sensitivity was 66.0% (ML) and 70.0% (non-ML) (-4.0% difference; 95% CI, -13.5%, 5.5%; P = 0.344). Among 161 reads by inexperienced readers, per-patient specificity in both groups was 76.3% (0% difference; 95% CI, -15.0%, 15.0%; P = 0.613), with sensitivity of 73.3% (ML) and 60.0% (non-ML) (13.3% difference; 95% CI, -7.9%, 34.5%; P = 0.313). Per-site specificity was high (>90%) for all metastatic sites and experience levels. There was high sensitivity for the detection of primary tumors (lung cancer detection rate of 98.6% with and without ML [0.0% difference; 95% CI, -2.0%, 2.0%; P = 1.00], colon cancer detection rate of 89.0% with and 90.6% without ML [-1.7% difference; 95% CI, -5.6%, 2.2%; P = 0.65]). When combining all reads from rounds 1 and 2, reading times fell by 6.2% (95% CI, -22.8%, 10.0%) when using ML. Round 2 read-times fell by 32% (95% CI, 20.8%, 42.8%) compared with round 1. Within round 2, there was a significant decrease in read-time when using ML support, estimated as 286 seconds (or 11%) quicker (P = 0.0281), using regression analysis to account for reader experience, read round, and tumor type. Interobserver agreement was moderate: Cohen κ = 0.64 (95% CI, 0.47, 0.81) with ML and Cohen κ = 0.66 (95% CI, 0.47, 0.81) without ML.
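The differences reported above take the form "ML minus non-ML" with a 95% CI. The following Python sketch shows one simple way such a difference in proportions and a relative change in reading time could be computed. It uses an unadjusted Wald interval and hypothetical counts; this is an assumption for illustration only, since the abstract does not state how the CIs were derived and the study's regression-based estimates would not be reproduced by this calculation.

```python
import math
from typing import Tuple


def proportion_difference_ci(
    x1: int, n1: int, x2: int, n2: int, z: float = 1.96
) -> Tuple[float, float, float]:
    """Difference p1 - p2 with an unadjusted Wald CI (z = 1.96 for about 95%)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se


def percent_change(before: float, after: float) -> float:
    """Relative change in percent; negative values mean a reduction."""
    return 100.0 * (after - before) / before


# Hypothetical counts (NOT study data): 40/50 correct with ML, 35/50 without.
diff, lower, upper = proportion_difference_ci(x1=40, n1=50, x2=35, n2=50)
print(f"difference={diff:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")

# Hypothetical mean reading times in seconds (NOT study data).
print(f"reading time change={percent_change(600.0, 540.0):.1f}%")
```

A CI that spans zero, as in this toy example, corresponds to the situation reported for sensitivity and specificity: the data do not show evidence of a difference between the two reading conditions.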
There was no evidence of a significant difference in per-patient sensitivity and specificity for detecting metastases or the primary tumor using concurrent ML compared with standard WB-MRI. Radiology read-times with or without ML support fell for round 2 reads compared with round 1, suggesting that readers familiarized themselves with the study reading method. During the second reading round, there was a significant reduction in reading time when using ML support.