Chronic Disease Research Institute, The Children's Hospital, and National Clinical Research Center for Child Health, School of Public Health, School of Medicine, Zhejiang University, No.866 Yu Hang Tang Road, Hangzhou, 310058, Zhejiang, China.
Department of Nutrition and Food Hygiene, School of Public Health, Zhejiang University, Hangzhou, Zhejiang, China.
Eur Radiol. 2023 Aug;33(8):5894-5906. doi: 10.1007/s00330-023-09515-1. Epub 2023 Mar 9.
We aimed to develop and validate a deep learning system (DLS) for detecting nonalcoholic fatty liver disease (NAFLD) that uses an auxiliary section to extract and output specific ultrasound diagnostic features, thereby improving the explainability and clinical relevance of the DLS.
In a community-based study of 4144 participants who underwent abdominal ultrasound scans in Hangzhou, China, we sampled 928 participants (617 [66.5%] females; mean age, 56 years ± 13 [standard deviation]; 2 images per participant) to develop and validate the DLS, a two-section neural network (2S-NNet). Radiologists' consensus diagnosis classified hepatic steatosis as none, mild, moderate, or severe. We also evaluated the NAFLD detection performance of six one-section neural network models and five fatty liver indices on our data set. We further used logistic regression to evaluate the influence of participants' characteristics on the correctness of 2S-NNet.
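The correctness analysis described above can be illustrated with a minimal sketch. It assumes a single binary covariate and an invented toy data set; the function names (`fit_logistic`, `wald_p_value`) and the fitting method (plain gradient descent with a Wald test) are illustrative choices, not the authors' implementation, which likely adjusted for several characteristics at once.

```python
import math
from typing import Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x: Sequence[float], y: Sequence[int],
                 lr: float = 1.0, steps: int = 5000) -> Tuple[float, float]:
    """Fit P(correct) = sigmoid(b0 + b1 * x) by gradient descent on the mean log-loss."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            err = sigmoid(b0 + b1 * xi) - yi
            g0 += err
            g1 += err * xi
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def wald_p_value(x: Sequence[float], b0: float, b1: float) -> float:
    """Two-sided Wald p-value for the slope b1, using the 2x2 Fisher information."""
    i00 = i01 = i11 = 0.0
    for xi in x:
        p = sigmoid(b0 + b1 * xi)
        w = p * (1.0 - p)
        i00 += w
        i01 += w * xi
        i11 += w * xi * xi
    det = i00 * i11 - i01 * i01
    se = math.sqrt(i00 / det)          # standard error of b1
    z = b1 / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical toy data (NOT from the study): one binary characteristic per
# participant and whether the model's prediction matched the reference label.
has_diabetes = [0, 0, 0, 0, 1, 1, 1, 1]
correct = [1, 0, 0, 0, 1, 1, 1, 0]

b0, b1 = fit_logistic(has_diabetes, correct)
p_value = wald_p_value(has_diabetes, b0, b1)
print(f"slope={b1:.3f}, p={p_value:.3f}")
```

A p-value above 0.05 for a covariate's slope is what "no significant impact on correctness" means in this kind of analysis; with a sample this small, the toy result says nothing about the study itself.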
The area under the receiver operating characteristic curve (AUROC) of 2S-NNet for hepatic steatosis was 0.90 for ≥ mild, 0.85 for ≥ moderate, and 0.93 for severe steatosis; for NAFLD, it was 0.90 for NAFLD presence, 0.84 for moderate to severe NAFLD, and 0.93 for severe NAFLD. The AUROC for NAFLD severity was 0.88 for 2S-NNet and 0.79-0.86 for the one-section models. The AUROC for NAFLD presence was 0.90 for 2S-NNet and 0.54-0.82 for the fatty liver indices. Age, sex, body mass index, diabetes, fibrosis-4 index, android fat ratio, and skeletal muscle via dual-energy X-ray absorptiometry had no significant impact on the correctness of 2S-NNet (p > 0.05).
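To make the threshold-wise AUROCs concrete, the following sketch dichotomizes an ordinal steatosis grade at each cutoff (≥ mild, ≥ moderate, severe) and computes the AUROC from the Mann-Whitney statistic. It assumes a single continuous model score per participant and uses invented toy data; the abstract does not say how 2S-NNet's outputs were scored, so the helper names and the data are hypothetical.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auroc_at_threshold(grades: Sequence[int], scores: Sequence[float],
                       min_grade: int) -> float:
    """Dichotomize ordinal grades (0=none, 1=mild, 2=moderate, 3=severe) at min_grade."""
    binary = [1 if g >= min_grade else 0 for g in grades]
    return auroc(binary, scores)


# Hypothetical toy data (NOT from the study): reference grades and model scores.
grades = [0, 0, 1, 1, 2, 2, 3, 3]
scores = [0.1, 0.3, 0.2, 0.6, 0.5, 0.8, 0.7, 0.9]

for name, cutoff in [(">= mild", 1), (">= moderate", 2), ("severe", 3)]:
    print(name, round(auroc_at_threshold(grades, scores, cutoff), 4))
```

Because each cutoff defines a different positive class, the three AUROCs need not be ordered, which is consistent with the reported pattern of 0.90, 0.85, and 0.93.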
With its two-section design, 2S-NNet detected NAFLD more accurately than the one-section designs while offering more explainable, clinically relevant utility.
• Based on the radiologists' consensus review, our DLS (2S-NNet) achieved an AUROC of 0.88 with its two-section design and detected NAFLD better than the one-section designs, with more explainable, clinically relevant utility.
• The 2S-NNet outperformed five fatty liver indices, with higher AUROCs (0.84-0.93 vs. 0.54-0.82) across NAFLD severity levels, suggesting that deep learning-based radiology may screen better than blood biomarker panels in epidemiological settings.
• The correctness of 2S-NNet was not significantly influenced by individual characteristics, including age, sex, body mass index, diabetes, fibrosis-4 index, android fat ratio, and skeletal muscle via dual-energy X-ray absorptiometry.