Khanykov Igor, Nenashev Vadim, Kharinov Mikhail
Laboratory of Big Data Technologies for Sociocyberphysical Systems, St. Petersburg Federal Research Center of the Russian Academy of Sciences, 14 Line V. O. 39, 199178 Saint Petersburg, Russia.
Laboratory of Intelligent Technologies and Modelling of Complex Systems, Institute of Computing Systems and Programming, Saint Petersburg State University of Aerospace Instrumentation, 67 B. Morskaia St., 190000 Saint Petersburg, Russia.
J Imaging. 2023 Jul 18;9(7):146. doi: 10.3390/jimaging9070146.
This paper presents interdisciplinary research in hierarchical cluster analysis of big data and the ordering of primary data, with the aim of detecting objects in a color or grayscale image. To do this on a limited domain of multidimensional data, the NP-hard problem of calculating close to piecewise constant data approximations with the smallest possible standard deviations or total squared errors is solved. The solution is achieved by revisiting, modernizing, and combining the classical Ward's clustering, split/merge, and K-means methods. The concepts of objects, images, and their elements are formalized as structures that are distinguishable from each other. The results of structuring and ordering the image data are presented to the user in two ways, as tabulated approximations of the image showing the available object hierarchies. For practical implementation as well as theoretical reasoning, reversible calculations with pixel sets are performed as easily as with individual pixels, in terms of Sleator-Tarjan dynamic trees and cyclic graphs forming an Algebraic Multi-Layer Network (AMN). The detailed treatment of the AMN significantly distinguishes this paper from our prior works. Also new is the established invariance of detected objects with respect to changes in the image context and to the conversion of the image into grayscale.
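As a rough illustration of the quantities named in the abstract, the sketch below computes the total squared error of a piecewise constant approximation (each cluster replaced by its mean) and performs one greedy merge in the style of classical Ward's clustering. It is not the authors' implementation: the function names are hypothetical, pixels are assumed to be plain tuples of numbers, and the AMN, split/merge, and K-means stages are not modeled.

```python
from typing import List, Sequence

Pixel = Sequence[float]      # assumed representation: (gray,) or (r, g, b)
Cluster = List[Pixel]


def mean(cluster: Cluster) -> List[float]:
    """Component-wise mean of a cluster of pixels."""
    n = len(cluster)
    dims = len(cluster[0])
    return [sum(p[d] for p in cluster) / n for d in range(dims)]


def squared_error(cluster: Cluster) -> float:
    """Sum of squared deviations of the pixels from the cluster mean."""
    m = mean(cluster)
    return sum((p[d] - m[d]) ** 2 for p in cluster for d in range(len(m)))


def total_squared_error(clusters: List[Cluster]) -> float:
    """Total squared error of the piecewise constant approximation."""
    return sum(squared_error(c) for c in clusters)


def ward_increment(a: Cluster, b: Cluster) -> float:
    """Increase in total squared error caused by merging a and b:
    |a| * |b| / (|a| + |b|) * ||mean(a) - mean(b)||^2."""
    dist2 = sum((x - y) ** 2 for x, y in zip(mean(a), mean(b)))
    return len(a) * len(b) / (len(a) + len(b)) * dist2


def ward_merge_step(clusters: List[Cluster]) -> List[Cluster]:
    """Merge the pair of clusters with the smallest error increment."""
    _, i, j = min(
        (ward_increment(clusters[i], clusters[j]), i, j)
        for i in range(len(clusters))
        for j in range(i + 1, len(clusters))
    )
    merged = list(clusters[i]) + list(clusters[j])
    rest = [c for k, c in enumerate(clusters) if k not in (i, j)]
    return rest + [merged]


if __name__ == "__main__":
    # Three single-pixel grayscale clusters (illustrative values only).
    start = [[(0.0,)], [(2.0,)], [(10.0,)]]
    after = ward_merge_step(start)
    print(total_squared_error(start), total_squared_error(after))
```

Repeating the merge step until one cluster remains yields a hierarchy of approximations with a non-decreasing total squared error. This exhaustive pair search is quadratic per step and serves only to make the definitions concrete.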