Shenouda Mena, Gudmundsson Eyjólfur, Li Feng, Straus Christopher M, Kindler Hedy L, Dudek Arkadiusz Z, Stinchcombe Thomas, Wang Xiaofei, Starkey Adam, Armato Samuel G
Department of Radiology, The University of Chicago, Chicago, IL, USA.
Icelandic Radiation Safety Office, Reykjavik, Iceland.
ArXiv. 2023 Nov 30:arXiv:2312.00223v1.
Malignant pleural mesothelioma (MPM) is the most common form of malignant mesothelioma, with exposure to asbestos being the primary cause of the disease. To assess response to treatment, tumor measurements are acquired from a patient's longitudinal computed tomography (CT) scans and evaluated. Tumor volume, however, is the more accurate metric for assessing tumor burden and response. Automated segmentation methods using deep learning can be used to obtain tumor volume, which is otherwise a tedious manual task. The deep learning-based tumor volume and contours can then be compared with a standard reference to assess the robustness of the automated segmentations. The purpose of this study was to evaluate the impact of probability map threshold on MPM tumor delineations generated using a convolutional neural network (CNN). Eighty-eight CT scans from 21 MPM patients were segmented by a VGG16/U-Net CNN. A radiologist modified the contours generated at a 0.5 probability threshold. Percent difference of tumor volume and overlap using the Dice Similarity Coefficient (DSC) were compared between the standard reference provided by the radiologist and CNN outputs for thresholds ranging from 0.001 to 0.9. CNN annotations consistently yielded smaller tumor volumes than radiologist contours. Reducing the probability threshold from 0.5 to 0.1 decreased the absolute percent volume difference, on average, from 43.96% to 24.18%. Median and mean DSC ranged from 0.58 to 0.60, with a peak at a threshold of 0.5; no distinct optimal threshold was found for percent volume difference. The CNN exhibited deficiencies with specific disease presentations, such as severe pleural effusion or disease in the pleural fissure. No single output threshold in the CNN probability maps was optimal for both tumor volume and DSC. This study emphasized the importance of considering both figures of merit when evaluating deep learning-based tumor segmentations across probability thresholds.
This work underscores the need to simultaneously assess tumor volume and spatial overlap when evaluating CNN performance. While automated segmentations may yield tumor volumes comparable to those of the reference standard, the spatial region delineated by the CNN at a specific threshold is equally important.
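The two figures of merit can be illustrated with a minimal sketch. It assumes a probability map and a binary reference mask stored as NumPy arrays of the same shape, a signed percent volume difference taken relative to the reference volume, and an inclusive threshold (`>=`). The function names and the toy data are hypothetical and are not taken from the study, whose exact definitions may differ.

```python
import numpy as np


def dice_coefficient(pred_mask, ref_mask):
    """Dice Similarity Coefficient between two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    ref = np.asarray(ref_mask, dtype=bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, ref).sum() / denom


def percent_volume_difference(pred_mask, ref_mask):
    """Signed percent difference in voxel count relative to the reference.

    Negative values mean the predicted volume is smaller than the reference.
    Voxel size cancels in the ratio, so voxel counts are sufficient.
    """
    pred_vol = np.count_nonzero(pred_mask)
    ref_vol = np.count_nonzero(ref_mask)
    if ref_vol == 0:
        raise ValueError("reference mask is empty")
    return 100.0 * (pred_vol - ref_vol) / ref_vol


def sweep_thresholds(prob_map, ref_mask, thresholds):
    """Binarize prob_map at each threshold and score it against ref_mask."""
    prob = np.asarray(prob_map, dtype=float)
    results = []
    for t in thresholds:
        pred = prob >= t  # assumption: inclusive threshold
        results.append(
            (t, dice_coefficient(pred, ref_mask),
             percent_volume_difference(pred, ref_mask))
        )
    return results


if __name__ == "__main__":
    # Toy data for illustration only, not data from the study.
    prob_map = np.array([0.9, 0.6, 0.3, 0.05])
    ref_mask = np.array([1, 1, 1, 0], dtype=bool)
    for t, dsc, pvd in sweep_thresholds(prob_map, ref_mask, [0.5, 0.1]):
        print(f"threshold={t}: DSC={dsc:.3f}, volume difference={pvd:.2f}%")
```

Reporting both values per threshold makes it visible when a threshold that improves volume agreement does not improve spatial overlap, or the reverse.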