Parakh Anushri, Lee Hyunkwang, Lee Jeong Hyun, Eisner Brian H, Sahani Dushyant V, Do Synho
Departments of Radiology (A.P., H.L., D.V.S., S.D.) and Urology (B.H.E.), Massachusetts General Hospital, 55 Fruit St, White 270, Boston, MA 02114; John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Mass (H.L.); and Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea (J.H.L.).
Radiol Artif Intell. 2019 Jul 24;1(4):e180066. doi: 10.1148/ryai.2019180066. eCollection 2019 Jul.
To investigate the diagnostic accuracy of a cascading convolutional neural network (CNN) for urinary stone detection on unenhanced CT images and to evaluate the performance of pretrained models enriched with labeled CT images across different scanners.
This HIPAA-compliant, institutional review board-approved, retrospective clinical study used unenhanced abdominopelvic CT scans from 535 adults suspected of having urolithiasis. The scans were obtained on two scanners (scanner 1 [hereafter S1] and scanner 2 [hereafter S2]). A radiologist reviewed clinical reports and labeled cases to establish the reference standard. Stones were present on 279 (S1, 131; S2, 148) and absent on 256 (S1, 158; S2, 98) scans. One hundred scans (50 from each scanner) were randomly reserved as the test dataset, and the rest were used for developing a cascade of two CNNs: The first CNN identified the extent of the urinary tract, and the second CNN detected the presence of stones. Nine model variations were developed by combining different training data sources (S1, S2, or both [hereafter SB]) with pretrained CNNs (ImageNet, GrayNet) or with a CNN that was not pretrained (Random). First, models were compared for generalizability at the section level. Second, models were assessed by using area under the receiver operating characteristic curve (AUC) and accuracy at the patient level with the test dataset from both scanners (n = 100).
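The two-stage cascade and the nine model variants described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: `cascade_predict`, `in_tract`, `stone_prob`, and the 0.5 threshold are hypothetical names and values, and taking the maximum section-level probability as the patient-level score is an assumption, since the abstract does not state how section-level outputs were aggregated.

```python
from itertools import product
from typing import Callable, Sequence, Tuple, TypeVar

Section = TypeVar("Section")

# Nine variants from the text: three training sources x three initializations.
TRAINING_SOURCES = ("S1", "S2", "SB")
INITIALIZATIONS = ("ImageNet", "GrayNet", "Random")
MODEL_VARIANTS = [
    f"{init}-{src}" for init, src in product(INITIALIZATIONS, TRAINING_SOURCES)
]


def cascade_predict(
    sections: Sequence[Section],
    in_tract: Callable[[Section], bool],      # stage 1 (hypothetical callable)
    stone_prob: Callable[[Section], float],   # stage 2 (hypothetical callable)
    threshold: float = 0.5,                   # assumed decision threshold
) -> Tuple[float, bool]:
    """Return (patient-level score, stone present?) for one CT scan."""
    # Stage 1: keep only the sections that lie within the urinary tract.
    kept = [s for s in sections if in_tract(s)]
    if not kept:
        return 0.0, False
    # Stage 2: score each kept section, then aggregate with max (assumption).
    score = max(float(stone_prob(s)) for s in kept)
    return score, score >= threshold
```

With trained networks in place of the two callables, the returned score could feed an AUC computation and the Boolean could feed a patient-level accuracy computation.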
The GrayNet-pretrained model showed higher classifier exactness than did ImageNet-pretrained or Random-initialized models when tested by using data from the same or different scanners at the section level. At the patient level, the AUC for stone detection was 0.92-0.95, depending on the model. Accuracy of GrayNet-SB (95%) was higher than that of ImageNet-SB (91%) and Random-SB (88%). For stones larger than 4 mm, all models showed similar performance (false-negative results: two of 34). For stones smaller than 4 mm, the numbers of false-negative results for GrayNet-SB, ImageNet-SB, and Random-SB were one of 16, three of 16, and five of 16, respectively. GrayNet-SB identified stones in all 22 test cases that had obstructive uropathy.
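The false-negative counts above translate directly into per-size-group sensitivities. The short sketch below only restates that arithmetic; the variable and function names are illustrative, and the derived percentages are not figures reported in the abstract.

```python
# False-negative counts as reported in the text.
SMALL_STONE_FN = {"GrayNet-SB": 1, "ImageNet-SB": 3, "Random-SB": 5}
SMALL_STONE_TOTAL = 16          # stones smaller than 4 mm
LARGE_STONE_FN = 2              # reported for all models
LARGE_STONE_TOTAL = 34          # stones larger than 4 mm


def sensitivity(false_negatives: int, total: int) -> float:
    """Fraction of stone-positive cases that were detected."""
    return (total - false_negatives) / total


small_stone_sensitivity = {
    model: sensitivity(fn, SMALL_STONE_TOTAL)
    for model, fn in SMALL_STONE_FN.items()
}
large_stone_sensitivity = sensitivity(LARGE_STONE_FN, LARGE_STONE_TOTAL)

for model, value in small_stone_sensitivity.items():
    print(f"{model}: {value:.1%} sensitivity for stones smaller than 4 mm")
print(f"All models: {large_stone_sensitivity:.1%} for stones larger than 4 mm")
```

The gap between models is concentrated in the small-stone group, which is consistent with the text's observation that performance was similar for larger stones.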
A cascading model of CNNs can detect urinary tract stones on unenhanced CT scans with high accuracy (AUC, 0.954). Performance and generalization of CNNs across scanners can be enhanced by using transfer learning with datasets enriched with labeled medical images. © RSNA, 2019