• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

高维数据的可扩展最近邻算法。

Scalable Nearest Neighbor Algorithms for High Dimensional Data.

出版信息

IEEE Trans Pattern Anal Mach Intell. 2014 Nov;36(11):2227-40. doi: 10.1109/TPAMI.2014.2321376.

DOI:10.1109/TPAMI.2014.2321376
PMID:26353063
Abstract

For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to be the most efficient: the randomized k-d forest and a new algorithm proposed in this paper, the priority search k-means tree. We also propose a new algorithm for matching binary features by searching multiple hierarchical clustering trees and show it outperforms methods typically used in the literature. We show that the optimal nearest neighbor algorithm and its parameters depend on the data set characteristics and describe an automated configuration procedure for finding the best algorithm to search a particular data set. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper. All this research has been released as an open source library called fast library for approximate nearest neighbors (FLANN), which has been incorporated into OpenCV and is now one of the most popular libraries for nearest neighbor matching.

摘要

对于许多计算机视觉和机器学习问题来说,大的训练集是取得良好性能的关键。然而,许多计算机视觉和机器学习算法中最耗费计算资源的部分包括查找与表示训练数据的高维向量的最近邻匹配。我们提出了用于近似最近邻匹配的新算法,并对其进行了评估和与以前的算法进行了比较。对于匹配高维特征,我们发现两种算法最为有效:随机 k-d 森林和本文提出的一种新算法,优先级搜索 k-均值树。我们还提出了一种用于匹配二进制特征的新算法,通过搜索多个层次聚类树,并展示了它优于文献中常用的方法。我们表明,最优的最近邻算法及其参数取决于数据集的特征,并描述了一种自动配置过程,用于查找搜索特定数据集的最佳算法。为了扩展到非常大的数据集,否则它们将无法在单个机器的内存中容纳,我们提出了一个分布式最近邻匹配框架,可以与本文中描述的任何算法一起使用。所有这些研究都已作为一个名为快速近似最近邻库(FLANN)的开源库发布,该库已被集成到 OpenCV 中,现在是最流行的最近邻匹配库之一。

相似文献

1
Scalable Nearest Neighbor Algorithms for High Dimensional Data.高维数据的可扩展最近邻算法。
IEEE Trans Pattern Anal Mach Intell. 2014 Nov;36(11):2227-40. doi: 10.1109/TPAMI.2014.2321376.
2
K-nearest neighbor finding using MaxNearestDist.使用最大最近距离进行K近邻查找。
IEEE Trans Pattern Anal Mach Intell. 2008 Feb;30(2):243-52. doi: 10.1109/TPAMI.2007.1182.
3
Fast Localization in Large-Scale Environments Using Supervised Indexing of Binary Features.利用二进制特征的监督索引实现大规模环境中的快速本地化。
IEEE Trans Image Process. 2016 Jan;25(1):343-58. doi: 10.1109/TIP.2015.2500030. Epub 2015 Nov 11.
4
Image Geo-Localization Based on Multiple Nearest Neighbor Feature Matching Using Generalized Graphs.基于广义图的多最近邻特征匹配的图像地理定位。
IEEE Trans Pattern Anal Mach Intell. 2014 Aug;36(8):1546-58. doi: 10.1109/TPAMI.2014.2299799.
5
A fast approximate nearest neighbor search algorithm in the Hamming space.汉明空间中的快速近似最近邻搜索算法。
IEEE Trans Pattern Anal Mach Intell. 2012 Dec;34(12):2481-8. doi: 10.1109/TPAMI.2012.170.
6
A Fast Exact k-Nearest Neighbors Algorithm for High Dimensional Search Using k-Means Clustering and Triangle Inequality.一种使用k均值聚类和三角不等式进行高维搜索的快速精确k近邻算法。
Proc Int Jt Conf Neural Netw. 2012 Feb 8;43(6):2351-2358. doi: 10.1016/j.patcog.2010.01.003.
7
Asymmetric Mapping Quantization for Nearest Neighbor Search.用于最近邻搜索的非对称映射量化
IEEE Trans Pattern Anal Mach Intell. 2020 Jul;42(7):1783-1790. doi: 10.1109/TPAMI.2019.2925347. Epub 2019 Jun 27.
8
Fast exact nearest patch matching for patch-based image editing and processing.基于面片的图像编辑和处理的快速精确最近邻面片匹配。
IEEE Trans Vis Comput Graph. 2011 Aug;17(8):1122-34. doi: 10.1109/TVCG.2010.226.
9
NV-tree: an efficient disk-based index for approximate search in very large high-dimensional collections.NV树:一种用于在超大型高维数据集中进行近似搜索的高效基于磁盘的索引。
IEEE Trans Pattern Anal Mach Intell. 2009 May;31(5):869-83. doi: 10.1109/TPAMI.2008.130.
10
Scalable large-margin Mahalanobis distance metric learning.可扩展的大间隔马氏距离度量学习
IEEE Trans Neural Netw. 2010 Sep;21(9):1524-30. doi: 10.1109/TNN.2010.2052630. Epub 2010 Aug 12.

引用本文的文献

1
Cognitively-plausible reinforcement learning in epidemiological agent-based simulations.基于代理的流行病学模拟中认知合理的强化学习
Front Epidemiol. 2025 Jul 28;5:1563731. doi: 10.3389/fepid.2025.1563731. eCollection 2025.
2
Telescope indexing for k-nearest neighbor search algorithms over high dimensional data & large data sets.用于高维数据和大数据集上的k近邻搜索算法的望远镜索引
Sci Rep. 2025 Jul 9;15(1):24788. doi: 10.1038/s41598-025-09856-5.
3
Spatial-Aware and Load Balancing Distributed Data Partitioning Strategies for Content-Based Multimedia Retrieval.
用于基于内容的多媒体检索的空间感知与负载均衡分布式数据分区策略
Res Sq. 2024 Sep 24:rs.3.rs-4973077. doi: 10.21203/rs.3.rs-4973077/v1.
4
Deep Deterministic Policy Gradient-Based Resource Allocation Considering Network Slicing and Device-to-Device Communication in Mobile Networks.移动网络中基于深度确定性策略梯度的考虑网络切片和设备到设备通信的资源分配
Sensors (Basel). 2024 Sep 20;24(18):6079. doi: 10.3390/s24186079.
5
A novel 3D LiDAR deep learning approach for uncrewed vehicle odometry.一种用于无人车辆里程计的新型三维激光雷达深度学习方法。
PeerJ Comput Sci. 2024 Jul 17;10:e2189. doi: 10.7717/peerj-cs.2189. eCollection 2024.
6
Adaptive robotic system for the inspection of aerospace slat actuator mount.用于航空航天缝翼致动器支架检查的自适应机器人系统。
Front Robot AI. 2024 Jun 27;11:1423319. doi: 10.3389/frobt.2024.1423319. eCollection 2024.
7
Fast and exact fixed-radius neighbor search based on sorting.基于排序的快速精确固定半径邻域搜索。
PeerJ Comput Sci. 2024 Mar 29;10:e1929. doi: 10.7717/peerj-cs.1929. eCollection 2024.
8
Zoish: A Novel Feature Selection Approach Leveraging Shapley Additive Values for Machine Learning Applications in Healthcare.佐什:一种利用 Shapley 加法值的新型特征选择方法,适用于医疗保健领域的机器学习应用。
Pac Symp Biocomput. 2024;29:81-95.
9
Ultra-fast and accurate electron ionization mass spectrum matching for compound identification with million-scale in-silico library.超快速、准确的电子电离质谱匹配,用于具有百万级规模的化合物鉴定。
Nat Commun. 2023 Jun 22;14(1):3722. doi: 10.1038/s41467-023-39279-7.
10
Online Multivariate Anomaly Detection and Localization for High-Dimensional Settings.在线多维异常检测和高维设置的本地化。
Sensors (Basel). 2022 Oct 28;22(21):8264. doi: 10.3390/s22218264.