Meng Linhao, van den Elzen Stef, Pezzotti Nicola, Vilanova Anna
IEEE Trans Vis Comput Graph. 2024 Jan;30(1):164-174. doi: 10.1109/TVCG.2023.3326600. Epub 2023 Dec 25.
Data features and class probabilities are two main perspectives for tasks such as evaluating model results and identifying problematic items. Class probabilities represent the likelihood that each instance belongs to a particular class; they can be produced by probabilistic classifiers or by human labeling with uncertainty. Since both perspectives are multi-dimensional, dimensionality reduction (DR) techniques are commonly used to extract informative characteristics from them. However, existing methods either focus solely on the data feature perspective or rely on class probability estimates to guide the DR process. In contrast to previous work, where separate views are linked to conduct the analysis, we propose a novel approach, class-constrained t-SNE, that combines data features and class probabilities in the same DR result. Specifically, we combine them by balancing two corresponding components in a cost function that optimizes the positions of both the data points and the class landmarks, which are iconic representations of the classes. Furthermore, an interactive, user-adjustable parameter balances these two components so that users can weight the perspectives according to their interest; it also enables smooth visual transitions between perspectives, preserving the mental map. We illustrate the method's application potential in model evaluation and visual-interactive labeling, and evaluate the DR results through a comparative analysis.
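The idea of balancing two cost components with a single user-adjustable weight can be illustrated with a minimal sketch. The abstract does not give the exact form of the cost function, so everything specific here is an assumption made for illustration: the feature component is taken to be the standard t-SNE KL divergence, the class component is taken to be a KL divergence between each point's class probabilities and its normalized Student-t similarity to the class landmarks, and `alpha` is assumed to weight the class component. The function names (`point_affinities`, `landmark_affinities`, `combined_cost`) are hypothetical and not taken from the paper.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between rows of a (n, d) and b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def point_affinities(Y):
    """Standard t-SNE low-dimensional joint affinities Q (entries sum to 1)."""
    w = 1.0 / (1.0 + pairwise_sq_dists(Y, Y))
    np.fill_diagonal(w, 0.0)  # a point is not its own neighbour
    return w / w.sum()


def landmark_affinities(Y, landmarks):
    """ASSUMPTION: Student-t similarity of each point to each class landmark,
    normalised per point so each row sums to 1 like a probability vector."""
    w = 1.0 / (1.0 + pairwise_sq_dists(Y, landmarks))
    return w / w.sum(axis=1, keepdims=True)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) summed over all entries; eps guards against log(0)."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


def combined_cost(P, class_probs, Y, landmarks, alpha):
    """Weighted sum of a feature term and a class-probability term.

    P           : (n, n) high-dimensional joint affinities, summing to 1
    class_probs : (n, k) class probabilities, rows summing to 1
    Y           : (n, 2) current positions of the data points
    landmarks   : (k, 2) current positions of the class landmarks
    alpha       : ASSUMED convention, 0 = features only, 1 = classes only
    """
    feature_term = kl_divergence(P, point_affinities(Y))
    # Averaged over points so both terms are on a comparable scale (assumption).
    class_term = kl_divergence(class_probs, landmark_affinities(Y, landmarks)) / len(Y)
    return (1.0 - alpha) * feature_term + alpha * class_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Y = rng.normal(size=(6, 2))          # toy point positions
    landmarks = rng.normal(size=(3, 2))  # toy landmark positions
    class_probs = rng.dirichlet(np.ones(3), size=6)
    P = point_affinities(rng.normal(size=(6, 2)))  # stand-in for real input affinities
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, combined_cost(P, class_probs, Y, landmarks, alpha))
```

In an interactive setting, both `Y` and `landmarks` would be optimized by gradient descent on this cost, and changing `alpha` while warm-starting from the previous positions is one plausible way to obtain the smooth transitions the abstract describes.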