Heo Seokjae, Na Seunguk
Department of Architectural Engineering, School of Architecture, Dankook University, Yongin-si, Gyeonggi-do, South Korea.
Waste Manag Res. 2025 Jul;43(7):1048-1059. doi: 10.1177/0734242X241290743. Epub 2024 Nov 18.
The growing volume of construction activity, and of the waste it generates, calls for segmentation models that can support efficient sorting and recycling. This study introduces WasteSAM, an enhanced version of the Segment Anything Model (SAM) tailored to the complexities of construction waste imagery. Drawing on a dataset of more than 15,000 masks covering five categories of construction materials, WasteSAM outperforms the original SAM by an average of 23.9% in dice similarity coefficient and 30.0% in normalized surface distance. Using stereo-image techniques to refine the training dataset helped WasteSAM discern the three-dimensional structure of waste materials more accurately, which in turn improved the precision of waste classification. The model also handles intricate textures and patterns well across diverse imaging conditions, including varying lighting and complex object interactions. While the results are promising, the study also highlights the need for high-quality, diverse datasets that reflect the complexity of real-world construction sites, rather than merely larger datasets.
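For readers unfamiliar with the first metric, the dice similarity coefficient is the standard overlap measure DSC = 2|A ∩ B| / (|A| + |B|) between a predicted mask A and a ground-truth mask B. The following is a minimal sketch of that general definition, not the evaluation code used in the study; the function name `dice_coefficient`, the nested-list mask representation, and the convention for two empty masks are all assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks.

    Both masks are nested lists (rows of 0/1 values) with the same
    number of pixels. This representation is an illustrative assumption.
    """
    # Flatten both masks into 1-D lists of booleans.
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    # |A ∩ B|: pixels marked as foreground in both masks.
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    # |A| + |B|: foreground pixel counts of each mask.
    total = sum(p) + sum(t)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 2x2 masks: one shared foreground pixel out of 2 + 1.
predicted = [[1, 1], [0, 0]]
ground_truth = [[1, 0], [0, 0]]
score = dice_coefficient(predicted, ground_truth)
```

A score of 1 means the masks coincide exactly and 0 means they share no foreground pixels. The normalized surface distance reported alongside it measures boundary agreement within a tolerance and is not covered by this sketch.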