Glasgow Pleural Disease Unit, Queen Elizabeth University Hospital, Glasgow, UK.
School of Computing Science, University of Glasgow, Glasgow, UK.
Thorax. 2022 Dec;77(12):1251-1259. doi: 10.1136/thoraxjnl-2021-217808. Epub 2022 Feb 2.
In malignant pleural mesothelioma (MPM), complex tumour morphology results in inconsistent radiological response assessment. Promising volumetric methods require automation to be practical. We developed a fully automated Convolutional Neural Network (CNN) for this purpose, performed blinded validation and compared CNN and human response classification and survival prediction in patients treated with chemotherapy.
In a multicentre retrospective cohort study, 183 CT datasets were split into training and internal validation (123 datasets (80 fully annotated); 108 patients; 1 centre) and external validation (60 datasets (all fully annotated); 30 patients; 3 centres). Detailed manual annotations were used to train the CNN, which used a two-dimensional U-Net architecture. CNN performance was evaluated using correlation, Bland-Altman and Dice agreement. Volumetric response/progression were defined as ≤−30%/≥+20% change and compared with modified Response Evaluation Criteria In Solid Tumours (mRECIST) by Cohen's kappa. Survival was assessed using Kaplan-Meier methodology.
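The Dice agreement and the volumetric response rule described here can be illustrated with a minimal Python sketch. It is not the study's code. The function names, the flat 0/1 mask representation and the "stable" label for the intermediate category are assumptions made for illustration, and the thresholds are read as a decrease of at least 30% for response and an increase of at least 20% for progression.

```python
def dice(mask_a, mask_b):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]
    if len(a) != len(b):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    # Two empty masks are treated as perfect agreement (an assumption).
    return 1.0 if total == 0 else 2.0 * intersection / total


def classify_response(baseline_cm3, followup_cm3,
                      response_pct=-30.0, progression_pct=20.0):
    """Classify percentage volume change against the assumed thresholds."""
    if baseline_cm3 <= 0:
        raise ValueError("baseline volume must be positive")
    change = 100.0 * (followup_cm3 - baseline_cm3) / baseline_cm3
    if change <= response_pct:
        return "response"
    if change >= progression_pct:
        return "progression"
    return "stable"  # hypothetical label for changes between the thresholds
```

For example, a tumour shrinking from 500 cm³ to 350 cm³ is a −30% change and would be classed as a response under these assumptions, while growth from 500 cm³ to 600 cm³ (+20%) would be classed as progression.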
Human and artificial intelligence (AI) volumes were strongly correlated (validation set r=0.851, p<0.0001). Agreement was strong (validation set mean bias +31 cm³ (p=0.182), 95% limits −345 to +407 cm³). Infrequent AI segmentation errors (4/60 validation cases) were associated with fissural tumour, contralateral pleural thickening and adjacent atelectasis. Human and AI volumetric responses agreed in 20/30 (67%) validation cases, κ=0.439 (0.178 to 0.700). AI and mRECIST agreed in 16/30 (55%) validation cases, κ=0.284 (0.026 to 0.543). Higher baseline tumour volume was associated with shorter survival.
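The agreement statistics reported here follow standard definitions, which the short Python sketch below illustrates. It is an illustration under stated assumptions rather than the authors' analysis: the function names are hypothetical, the limits of agreement use the conventional bias ± 1.96 standard deviations of the paired differences, and kappa is the unweighted form without a confidence interval.

```python
from collections import Counter
from statistics import mean, stdev


def bland_altman(ai_volumes, human_volumes):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - h for a, h in zip(ai_volumes, human_volumes)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread


def cohen_kappa(ratings_1, ratings_2):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    n = len(ratings_1)
    observed = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    counts_1, counts_2 = Counter(ratings_1), Counter(ratings_2)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(counts_1[k] * counts_2[k]
                   for k in set(counts_1) | set(counts_2)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single, identical category
    return (observed - expected) / (1.0 - expected)
```

By construction the two limits of agreement are symmetric about the mean bias, which gives a quick consistency check on any reported bias and limits.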
We have developed and validated the first fully automated CNN for volumetric MPM segmentation. CNN performance may be further improved by enriching future training sets with morphologically challenging features. Volumetric response thresholds require further calibration in future studies.