Predicting eGFR Status After Radical Nephrectomy or Partial Nephrectomy for Renal Cell Carcinoma on CT Using a Self-attention-based Model: Variable Vision Transformer (vViT).

Author information

Usuzaki Takuma, Inamori Ryusei, Ishikuro Mami, Obara Taku, Takaya Eichi, Homma Noriyasu, Takase Kei

Affiliations

Department of Diagnostic Radiology, Tohoku University Hospital, Sendai, Japan.

Tohoku University Hospital, 1-1 Seiryo-Machi, Aoba-Ku, Sendai, Miyagi, 980-8574, Japan.

Publication information

J Imaging Inform Med. 2024 Dec;37(6):3057-3069. doi: 10.1007/s10278-024-01180-0. Epub 2024 Jun 28.

DOI:10.1007/s10278-024-01180-0

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11612086/

Abstract

OBJECTIVE

To assess how effectively the variable Vision Transformer (vViT) model predicts postoperative renal function decline from clinical data, medical images, and image-derived features, and to identify the factor that most strongly influences this prediction.

MATERIALS AND METHODS

We developed two models, eGFR10 and eGFR20, to identify renal cell carcinoma patients whose estimated glomerular filtration rate (eGFR) fell postoperatively by more than 10 and by more than 20, respectively. The eGFR10 model was trained on 75 patients and tested on 27, while the eGFR20 model was trained on 77 patients and tested on 24. The vViT inputs comprised a class token, patient characteristics (age, sex, BMI), comorbidities (peripheral vascular disease, diabetes, liver disease), habits (smoking, alcohol), surgical details (ischemia time, blood loss, type and procedure of surgery, approach, operative time), radiomics, and tumor and kidney imaging. We used permutation feature importance to evaluate the contribution of each input sector. The performance of vViT was compared with that of convolutional neural network (CNN) models (VGG16, ResNet50, and DenseNet121) using the McNemar and DeLong tests.
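The sector-level permutation importance mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy classifier `toy_predict`, the three-column data, the sector names, and the function `sector_permutation_importance` are all hypothetical, chosen only to show the idea of shuffling every column of one sector together across patients and measuring the resulting drop in accuracy.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)

def sector_permutation_importance(predict, rows, labels, sectors,
                                  n_repeats=20, seed=0):
    """Mean accuracy drop when one sector's columns are shuffled across patients.

    All columns of a sector are taken from the same donor patient, so the
    sector is permuted as a block rather than column by column.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importance = {}
    for name, cols in sectors.items():
        drops = []
        for _ in range(n_repeats):
            order = list(range(len(rows)))
            rng.shuffle(order)
            permuted = []
            for i, row in enumerate(rows):
                new_row = list(row)
                donor = rows[order[i]]
                for c in cols:
                    new_row[c] = donor[c]
                permuted.append(new_row)
            drops.append(baseline - accuracy(predict, permuted, labels))
        importance[name] = sum(drops) / n_repeats
    return importance

# Hypothetical toy data; columns: [age, ischemia_time_min, radiomics_score]
rows = [
    [55, 30, 0.2], [62, 10, 0.8], [48, 35, 0.5], [70, 12, 0.1],
    [66, 28, 0.9], [59, 15, 0.4], [51, 40, 0.3], [73, 8, 0.7],
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

def toy_predict(row):
    """Hypothetical classifier that looks only at ischemia time."""
    return 1 if row[1] > 25 else 0

sectors = {"patient": [0], "surgical": [1], "radiomics": [2]}
importance = sector_permutation_importance(toy_predict, rows, labels, sectors)
for name, drop in importance.items():
    print(f"{name}: mean accuracy drop = {drop:.3f}")
```

In this toy setup only the "surgical" sector shows a positive drop, because the hypothetical classifier ignores the other columns. With a real multimodal model, the same procedure would be applied to each input sector in turn.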

RESULTS

The eGFR10 model achieved an accuracy of 0.741 and an AUC-ROC of 0.692, while the eGFR20 model attained an accuracy of 0.792 and an AUC-ROC of 0.812. The surgical and radiomics sectors were the most influential in both models. The vViT had higher accuracy and AUC-ROC than VGG16 and ResNet50, and higher AUC-ROC than DenseNet121 (p < 0.05). Specifically, the vViT did not have a statistically different AUC-ROC compared to VGG16 (p = 1.0) and ResNet50 (p = 0.7) but had a statistically different AUC-ROC compared to DenseNet121 (p = 0.87) for the eGFR10 model. For the eGFR20 model, the vViT did not have a statistically different AUC-ROC compared to VGG16 (p = 0.72), ResNet50 (p = 0.88), and DenseNet121 (p = 0.64).
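For readers unfamiliar with the paired comparison named in the methods, the following is a minimal sketch of an exact (binomial) McNemar test on two classifiers evaluated on the same test cases. The labels and predictions are invented for illustration and are unrelated to the study data; the function name `mcnemar_exact` is hypothetical, and the abstract does not state whether the authors used the exact or the chi-squared form of the test.

```python
from math import comb

def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions.

    b counts cases where model A is right and model B is wrong;
    c counts cases where model A is wrong and model B is right.
    Under the null hypothesis, the discordant pairs split 50/50.
    """
    b = sum(1 for y, a, p in zip(y_true, pred_a, pred_b) if a == y and p != y)
    c = sum(1 for y, a, p in zip(y_true, pred_a, pred_b) if a != y and p == y)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)

# Invented example: 10 test cases, two hypothetical models
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
pred_a = [1, 1, 1, 1, 0, 0, 0, 1, 1, 0]
pred_b = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

b, c, p_value = mcnemar_exact(y_true, pred_a, pred_b)
print(f"A right / B wrong: {b}, A wrong / B right: {c}, p = {p_value:.3f}")
```

With small test sets such as the 27 and 24 patients reported here, the number of discordant pairs is necessarily small, so large p-values are expected even when the point estimates of accuracy differ.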

CONCLUSION

The vViT model, a transformer-based approach for multimodal data, shows promise for preoperative CT-based prediction of eGFR status in patients with renal cell carcinoma.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75b5/11612086/2f28d80b4206/10278_2024_1180_Fig1a_HTML.jpg