Enhancing image retrieval through optimal barcode representation.

Authors

Khosrowshahli Rasa, Kheiri Farnaz, Asilian Bidgoli Azam, Tizhoosh H R, Makrehchi Masoud, Rahnamayan Shahryar

Affiliations

Faculty of Mathematics and Science, Brock University, St. Catharines, ON, L2S 3A1, Canada.

Faculty of Engineering and Applied Sciences, University of Ontario Institute of Technology, Oshawa, ON, L1G 0C5, Canada.

Publication information

Sci Rep. 2025 Aug 7;15(1):28847. doi: 10.1038/s41598-025-14576-x.

Abstract

Binary encoding of data has proven to be a versatile tool for optimizing data processing and memory efficiency in various machine learning applications. One such application is deep barcoding, which generates barcodes from features extracted by deep learning models in order to retrieve similar cases among millions of indexed images. Despite recent advances in barcode generation methods, converting high-dimensional feature vectors (e.g., deep features) into compact and discriminative binary barcodes remains an urgent and unresolved problem. Difference-based binarization is one of the most efficient binarization methods: it transforms a continuous feature vector into a binary sequence that captures trend information. However, the performance of this method depends heavily on the ordering of the input features, which poses a significant combinatorial challenge. This research addresses the problem by optimizing feature sequences based on retrieval performance metrics. Our approach identifies optimal feature orderings, leading to substantial improvements in retrieval effectiveness compared to arbitrary or default orderings. We assess the performance of the proposed approach in various medical and non-medical image retrieval tasks. The evaluation includes medical images from The Cancer Genome Atlas (TCGA), a comprehensive publicly available dataset, as well as a COVID-19 chest X-ray dataset. In addition, we evaluate the proposed approach on non-medical benchmark image datasets, namely CIFAR-10, CIFAR-100, and Fashion-MNIST. Our findings demonstrate that optimizing the binary barcode representation significantly enhances accuracy for fast image retrieval across a wide range of applications, highlighting the applicability and potential of barcodes in various domains.
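To illustrate why feature ordering matters for difference-based binarization, the following is a minimal sketch in Python and not the authors' implementation. The abstract does not give the exact bit rule or the optimizer, so two things here are assumptions: a bit is set to 1 when the next feature in the sequence is larger than the current one, and the ordering is tuned with a simple swap-based hill climb scored by leave-one-out top-1 retrieval accuracy. The function names (`diff_barcode`, `hamming`, `top1_accuracy`, `search_order`) are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def diff_barcode(features: Sequence[float], order: Sequence[int]) -> List[int]:
    """Reorder the features, then emit 1 where the next value is larger (assumed rule)."""
    x = [features[j] for j in order]
    return [1 if x[i + 1] > x[i] else 0 for i in range(len(x) - 1)]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    return sum(p != q for p, q in zip(a, b))


def top1_accuracy(data, labels, order) -> float:
    """Leave-one-out retrieval: a hit when the nearest barcode shares the query's label."""
    codes = [diff_barcode(v, order) for v in data]
    hits = 0
    for q in range(len(codes)):
        others = (i for i in range(len(codes)) if i != q)
        nearest = min(others, key=lambda i: hamming(codes[q], codes[i]))
        hits += labels[nearest] == labels[q]
    return hits / len(codes)


def search_order(data, labels, n_iters: int = 200, seed: int = 0) -> Tuple[List[int], float]:
    """Swap-based hill climbing over permutations (a stand-in optimizer, assumed)."""
    rng = random.Random(seed)
    order = list(range(len(data[0])))
    best = top1_accuracy(data, labels, order)
    for _ in range(n_iters):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        score = top1_accuracy(data, labels, order)
        if score >= best:
            best = score  # keep the swap
        else:
            order[i], order[j] = order[j], order[i]  # revert the swap
    return order, best


# The same vector yields different barcodes under different orderings.
v = [0.2, 0.9, 0.4, 0.7]
print(diff_barcode(v, [0, 1, 2, 3]))  # [1, 0, 1]
print(diff_barcode(v, [0, 2, 3, 1]))  # [1, 1, 1]
```

With `d` features there are `d!` possible orderings, so exhaustive search is infeasible for deep feature vectors; the hill climb above only shows the shape of the search loop, in which each candidate ordering is scored by a retrieval metric.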

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a26b/12328609/96e17725763c/41598_2025_14576_Figa_HTML.jpg
