Development of UroSAM: A Machine Learning Model to Automatically Identify Kidney Stone Composition from Endoscopic Video.

University of Rochester, Rochester, New York, USA.

University of Rochester Medical Center, Rochester, New York, USA.

J Endourol. 2024 Aug;38(8):748-754. doi: 10.1089/end.2023.0740. Epub 2024 May 31.

Chemical composition analysis is important in prevention counseling for kidney stone disease. Advances in laser technology have made dusting techniques more prevalent, but dusting offers no consistent way to collect enough material for chemical analysis, leading many to forgo this test. We developed a novel machine learning (ML) model to assess stone composition from intraoperative endoscopic video data. Two endourologists performed ureteroscopy for kidney stones ≥ 10 mm. Representative videos were recorded intraoperatively. Individual frames were extracted from the videos, and the stone was outlined by human tracing. An ML model, UroSAM, was built and trained to automatically identify kidney stones in the images and predict the majority stone composition as one of the following: calcium oxalate monohydrate (COM), calcium oxalate dihydrate (COD), calcium phosphate (CAP), or uric acid (UA). UroSAM was built on top of the publicly available Segment Anything Model (SAM) and incorporated a U-Net convolutional neural network (CNN). A total of 78 ureteroscopy videos were collected; 50 were used for the model after exclusions (32 COM, 8 COD, 8 CAP, 2 UA). The ML model segmented the images with 94.77% precision. The Dice coefficient (0.9135) and Intersection over Union (0.8496) confirmed good segmentation performance. A video-wise evaluation demonstrated 60% correct classification of stone composition. Subgroup analysis showed correct classification in 84.4% of COM videos. An adaptive threshold technique was used to mitigate the model's bias toward COM caused by data imbalance; this improved the overall correct classification to 62% while improving the classification of COD, CAP, and UA videos. This study demonstrates the effective development of UroSAM, an ML model that precisely identifies kidney stones from natural endoscopic video data. More high-quality video data will improve the performance of the model in classifying the majority stone composition.
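The abstract reports three segmentation metrics (precision, Dice coefficient, and Intersection over Union) without stating how they were computed. The sketch below uses the standard pixel-wise definitions on a pair of binary masks; it is an illustration under that assumption, not the authors' evaluation code. The function name `segmentation_metrics`, the toy masks, and the conventions for empty masks are all hypothetical.

```python
from typing import Dict, Sequence

# A binary mask as rows of 0/1 values (1 = pixel labeled as stone).
Mask = Sequence[Sequence[int]]


def segmentation_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Pixel-wise precision, Dice, and IoU for two equally sized binary masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # predicted stone, truly stone
            elif p and not t:
                fp += 1          # predicted stone, actually background
            elif t and not p:
                fn += 1          # missed stone pixel

    union = tp + fp + fn
    # Assumed conventions: precision is 0.0 when nothing is predicted;
    # Dice and IoU are 1.0 when both masks are empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    iou = tp / union if union else 1.0
    return {"precision": precision, "dice": dice, "iou": iou}


# Toy 3x4 masks: a model prediction and a human-traced outline.
pred = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

metrics = segmentation_metrics(pred, truth)
print(metrics)
```

For a single pair of masks the two overlap scores are linked by Dice = 2·IoU / (1 + IoU). Values averaged over many frames need not satisfy this identity exactly, which would be consistent with the reported 0.9135 and 0.8496 if they are per-frame averages (an assumption; the abstract does not say).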
