Rajendran Praveenbalaji, Chen Yizheng, Qiu Liang, Niedermayr Thomas, Liu Wu, Buyyounouski Mark, Bagshaw Hilary, Han Bin, Yang Yong, Kovalchuk Nataliya, Gu Xuejun, Hancock Steven, Xing Lei, Dai Xianjin
Department of Radiation Oncology, Stanford University, Stanford, California.
Int J Radiat Oncol Biol Phys. 2025 Jan 1;121(1):230-240. doi: 10.1016/j.ijrobp.2024.07.2149. Epub 2024 Aug 6.
Artificial intelligence-aided methods have made significant progress in the auto-delineation of normal tissues. However, these approaches struggle with the auto-contouring of radiation therapy target volumes. Our goal was to model target volume delineation as a clinical decision-making problem and to resolve it with large language model-aided multimodal learning.
A vision-language model, termed Medformer, was developed, employing a hierarchical vision transformer as its backbone and incorporating large language models to extract text-rich features. The contextually embedded linguistic features are integrated into the visual features through a visual-language attention module for language-aware visual encoding. Metrics, including the Dice similarity coefficient (DSC), intersection over union (IOU), and 95th percentile Hausdorff distance (HD95), were used to quantitatively evaluate the model's performance. The evaluation was conducted on an in-house prostate cancer dataset and a public oropharyngeal carcinoma dataset, totaling 668 subjects.
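The abstract does not disclose the implementation of the visual-language attention module. As a rough sketch only, language-aware visual encoding of this kind is commonly realized as cross-attention in which visual tokens query text embeddings from a language model; the PyTorch module below illustrates that general pattern under assumed token shapes and dimensions, and is not the authors' Medformer code.

import torch
import torch.nn as nn

class VisualLanguageAttention(nn.Module):
    # Illustrative cross-attention: visual tokens (queries) attend to
    # contextual text embeddings (keys/values) from a language model.
    def __init__(self, vis_dim: int, txt_dim: int, n_heads: int = 8):
        super().__init__()
        self.proj_txt = nn.Linear(txt_dim, vis_dim)  # align text width to visual width
        self.attn = nn.MultiheadAttention(vis_dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(vis_dim)

    def forward(self, vis_tokens: torch.Tensor, txt_tokens: torch.Tensor) -> torch.Tensor:
        # vis_tokens: (B, N_v, vis_dim) from a hierarchical vision transformer stage
        # txt_tokens: (B, N_t, txt_dim) from the large language model
        txt = self.proj_txt(txt_tokens)
        fused, _ = self.attn(query=vis_tokens, key=txt, value=txt)
        return self.norm(vis_tokens + fused)  # residual, language-aware visual features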
Our Medformer achieved a DSC of 0.81 ± 0.10 versus 0.72 ± 0.10, IOU of 0.73 ± 0.12 versus 0.65 ± 0.09, and HD95 of 9.86 ± 9.77 mm versus 19.13 ± 12.96 mm for delineation of gross tumor volume on the prostate cancer dataset. Similarly, on the oropharyngeal carcinoma dataset, it achieved a DSC of 0.77 ± 0.11 versus 0.72 ± 0.09, IOU of 0.70 ± 0.09 versus 0.65 ± 0.07, and HD95 of 7.52 ± 4.8 mm versus 13.63 ± 7.13 mm, representing significant improvements (P < 0.05). For delineating the clinical target volume, Medformer achieved a DSC of 0.91 ± 0.04, IOU of 0.85 ± 0.05, and HD95 of 2.98 ± 1.60 mm, comparable with other state-of-the-art algorithms.
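To aid interpretation of the reported numbers, the three metrics have standard definitions: DSC(A, B) = 2|A ∩ B| / (|A| + |B|), IOU(A, B) = |A ∩ B| / |A ∪ B|, and HD95 is the 95th percentile of the distances between the two contour surfaces. The functions below compute them from binary masks under those standard definitions (one common HD95 variant, used here, pools both surface-to-surface directions before taking the percentile); they are illustrative and are not the authors' evaluation code.

import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dsc(pred: np.ndarray, gt: np.ndarray) -> float:
    # Dice similarity coefficient of two boolean masks.
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    # Intersection over union of two boolean masks.
    inter = np.logical_and(pred, gt).sum()
    return inter / np.logical_or(pred, gt).sum()

def hd95(pred: np.ndarray, gt: np.ndarray, spacing=(1.0, 1.0, 1.0)) -> float:
    # 95th percentile Hausdorff distance between mask surfaces,
    # in mm when voxel spacing is given in mm.
    sp = pred & ~binary_erosion(pred)   # surface voxels of the prediction
    sg = gt & ~binary_erosion(gt)       # surface voxels of the ground truth
    d_to_gt = distance_transform_edt(~sg, sampling=spacing)    # distance to gt surface
    d_to_pred = distance_transform_edt(~sp, sampling=spacing)  # distance to pred surface
    dists = np.concatenate([d_to_gt[sp], d_to_pred[sg]])
    return float(np.percentile(dists, 95))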
Auto-delineation of the treatment target based on multimodal learning outperforms conventional approaches that rely purely on visual features. Our method could be adopted into routine practice to rapidly contour the clinical target volume and gross tumor volume.