Selby Heather M, Mukherjee Pritam, Parham Christopher, Malik Sachin B, Gevaert Olivier, Napel Sandy, Shah Rajesh P
Stanford University School of Medicine, Stanford Center for Biomedical Informatics (BMIR), Stanford, California, United States.
National Institutes of Health Clinical Center, Bethesda, Maryland, United States.
J Med Imaging (Bellingham). 2023 Jul;10(4):044006. doi: 10.1117/1.JMI.10.4.044006. Epub 2023 Aug 9.
We aim to evaluate the performance of radiomic biopsy (RB), best-fit bounding box (BB), and a deep-learning-based segmentation method called no-new-U-Net (nnU-Net) against the standard full manual (FM) segmentation method for classifying lung nodules as benign or malignant using a computed tomography (CT) radiomic machine learning model.
A total of 188 CT scans of lung nodules from 2 institutions were used for our study. One radiologist identified and delineated all 188 lung nodules, whereas a second radiologist segmented a subset of these nodules. Both radiologists employed the FM and RB segmentation methods. BB segmentations were generated computationally from the FM segmentations. The nnU-Net, a deep-learning-based segmentation method, performed automatic nodule detection and segmentation. The time radiologists took to perform segmentations was recorded. Radiomic features were extracted from the segmentations produced by each method, and models to classify lung nodules as benign or malignant were developed. The Kruskal-Wallis and DeLong tests were used to compare segmentation times and areas under the curve (AUC), respectively.
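As an illustration of how a BB segmentation might be derived computationally from an FM segmentation, the following is a minimal sketch, not the authors' implementation. It assumes the FM segmentation is a binary voxel mask indexed as `mask[z][y][x]` and that "best-fit" means the tightest axis-aligned box enclosing all nodule voxels; the abstract does not specify either detail, and the function names `bounding_box` and `box_mask` are hypothetical.

```python
from typing import List, Tuple

# Assumed representation: mask[z][y][x], with 1 marking a nodule voxel.
Mask3D = List[List[List[int]]]
Corner = Tuple[int, int, int]


def bounding_box(mask: Mask3D) -> Tuple[Corner, Corner]:
    """Return the inclusive (min, max) corners of the tightest
    axis-aligned box that encloses every nonzero voxel."""
    coords = [
        (z, y, x)
        for z, plane in enumerate(mask)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value
    ]
    if not coords:
        raise ValueError("mask contains no segmented voxels")
    zs, ys, xs = zip(*coords)
    return (min(zs), min(ys), min(xs)), (max(zs), max(ys), max(xs))


def box_mask(mask: Mask3D) -> Mask3D:
    """Build a new mask of the same shape in which every voxel
    inside the bounding box is set to 1 and all others to 0."""
    lo, hi = bounding_box(mask)
    return [
        [
            [
                1
                if lo[0] <= z <= hi[0]
                and lo[1] <= y <= hi[1]
                and lo[2] <= x <= hi[2]
                else 0
                for x in range(len(row))
            ]
            for y, row in enumerate(plane)
        ]
        for z, plane in enumerate(mask)
    ]
```

Under these assumptions, the box mask would take the place of the manual contour as the region from which radiomic features are extracted.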
For the delineation of the FM, RB, and BB segmentations, the two radiologists required a median time (IQR) of 113 (54 to 251.5), 21 (9.25 to 38), and 16 (12 to 64.25) s, respectively. In dataset 1, the mean AUCs (95% CI) of the FM, RB, BB, and nnU-Net models were 0.964 (0.96 to 0.968), 0.985 (0.983 to 0.987), 0.961 (0.956 to 0.965), and 0.878 (0.869 to 0.888), respectively. In dataset 2, the mean AUCs (95% CI) of the FM, RB, BB, and nnU-Net models were 0.717 (0.705 to 0.729), 0.919 (0.913 to 0.924), 0.699 (0.687 to 0.711), and 0.644 (0.632 to 0.657), respectively.
RB-based models outperformed FM and BB models in classifying lung nodules as benign or malignant in two independent datasets, while deep-learning segmentation-based models performed similarly to FM and BB models. RB could be a more efficient segmentation method, but further validation is needed.