Yi Yunai, Sun Diya, Li Peixin, Kim Tae-Kyun, Xu Tianmin, Pei Yuru
Key Laboratory of Machine Perception (MOE), Department of Machine Intelligence, Peking University, Beijing, 100871 China.
Department of Electrical and Electronic Engineering, Imperial College London, London, UK.
Comput Vis Media (Beijing). 2022;8(2):257-272. doi: 10.1007/s41095-021-0241-9. Epub 2021 Dec 6.
This paper presents an unsupervised clustering random-forest-based metric for affinity estimation in large, high-dimensional data. The node-splitting criterion used during forest construction can handle rank deficiency when measuring cluster compactness. The binary forest-based metric is extended to continuous metrics by exploiting both the common traversal path and the smallest shared parent node. The proposed forest-based metric estimates affinity efficiently by passing data pairs down the forest, using a limited number of decision trees. A pseudo-leaf-splitting (PLS) algorithm is introduced to account for spatial relationships; it regularizes the affinity measures and overcomes inconsistent leaf assignments. The random-forest-based metric with PLS facilitates the establishment of consistent, point-wise correspondences. The proposed method has been applied to automatic phrase recognition from color and depth videos and to point-wise correspondence. Extensive experiments demonstrate the effectiveness of the proposed method for affinity estimation in comparison with state-of-the-art methods.
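The difference between the binary and the continuous forest-based metric can be illustrated with a minimal Python sketch. It is not the authors' implementation: the path representation, the function names, and in particular the normalization by the longer of the two paths are assumptions made for illustration, since the abstract does not give the exact formula. Each sample is described by the node ids it visits from root to leaf in every tree; the binary metric counts trees in which two samples share a leaf, while the path-based variant scores how much of the traversal path they share, which is the depth of their deepest shared ancestor.

```python
from typing import Sequence, Tuple

# One root-to-leaf path per tree; node ids are assumed unique within a tree.
Path = Tuple[int, ...]


def common_prefix_len(a: Path, b: Path) -> int:
    """Number of nodes two paths share, counted from the root."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def binary_affinity(paths_a: Sequence[Path], paths_b: Sequence[Path]) -> float:
    """Fraction of trees in which both samples end in the same leaf."""
    same = sum(1 for a, b in zip(paths_a, paths_b) if a[-1] == b[-1])
    return same / len(paths_a)


def path_affinity(paths_a: Sequence[Path], paths_b: Sequence[Path]) -> float:
    """Mean shared-path fraction over trees (normalization is an assumption)."""
    total = 0.0
    for a, b in zip(paths_a, paths_b):
        total += common_prefix_len(a, b) / max(len(a), len(b))
    return total / len(paths_a)


# Hypothetical toy forest with two trees of depth 3.
p = [(0, 1, 3), (0, 2, 5)]
q = [(0, 1, 4), (0, 2, 5)]  # sibling leaf of p in tree 1, same leaf in tree 2
r = [(0, 2, 6), (0, 1, 3)]  # shares only the root with p in both trees

print(binary_affinity(p, q), path_affinity(p, q))
print(binary_affinity(p, r), path_affinity(p, r))
```

In this toy setting the binary metric cannot distinguish a sibling leaf from a distant one within a single tree, whereas the path-based score does, which is one reason a continuous metric can remain informative with fewer trees.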