Barnes Danielle, Polanco Luis, Perea Jose A
Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, United States.
Department of Mathematics, Michigan State University, East Lansing, MI, United States.
Front Artif Intell. 2021 Jul 28;4:681174. doi: 10.3389/frai.2021.681174. eCollection 2021.
Many methods currently exist for featurization, the process of mapping persistence diagrams to Euclidean space while preserving as much of their structure as possible. However, to our knowledge, there are presently no methodical comparisons of existing approaches, nor a standardized collection of test data sets. This paper provides a comparative study of several such methods. In particular, we review, evaluate, and compare the stable multi-scale kernel, persistence landscapes, persistence images, the ring of algebraic functions, template functions, and adaptive template systems. Using these approaches for feature extraction, we apply and compare popular machine learning methods on five data sets: MNIST, Shape Retrieval of Non-Rigid 3D Human Models (SHREC14), extracts from the Protein Classification Benchmark Collection (Protein), MPEG7 shape matching, and the HAM10000 skin lesion data set. These data sets are commonly used in the works that introduced the above featurization methods, and we use them to evaluate predictive utility in real-world applications.
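To make the notion of featurization concrete, the following is a minimal sketch of one of the listed methods, a persistence image, written in plain Python. It is an illustration under stated assumptions rather than the implementation evaluated in the paper: the function name `persistence_image`, the parameters `resolution`, `sigma`, `lo`, and `hi`, the linear persistence weighting, and the choice to evaluate the weighted Gaussians at pixel centers (instead of integrating over each pixel) are all assumptions made here for brevity.

```python
import math


def persistence_image(diagram, resolution=8, sigma=0.1, lo=0.0, hi=1.0):
    """Map a persistence diagram to a fixed-length feature vector.

    `diagram` is an iterable of (birth, death) pairs. Each pair is moved to
    (birth, persistence) coordinates, replaced by a Gaussian of width `sigma`,
    and weighted linearly by its persistence. The resulting surface is sampled
    at the centers of a `resolution` x `resolution` grid over [lo, hi]^2.
    All parameter names and defaults are hypothetical choices for this sketch.
    """
    step = (hi - lo) / resolution
    centers = [lo + (i + 0.5) * step for i in range(resolution)]
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)  # 2D Gaussian normalization

    image = [0.0] * (resolution * resolution)
    for birth, death in diagram:
        pers = death - birth  # points on the diagonal get weight zero
        for row, p_c in enumerate(centers):
            for col, b_c in enumerate(centers):
                sq_dist = (b_c - birth) ** 2 + (p_c - pers) ** 2
                gauss = norm * math.exp(-sq_dist / (2.0 * sigma ** 2))
                image[row * resolution + col] += pers * gauss
    return image


# Usage: two diagrams with different numbers of points yield vectors of the
# same length, which is what allows standard classifiers to be applied.
features_a = persistence_image([(0.1, 0.5), (0.2, 0.9)])
features_b = persistence_image([(0.0, 0.4)])
```

The point of the sketch is only that diagrams of varying size become vectors in a common Euclidean space; the other methods compared in the paper achieve the same goal by different constructions.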