Kim Hannah, Drake Barry, Endert Alex, Park Haesun
IEEE Trans Vis Comput Graph. 2021 Sep;27(9):3644-3655. doi: 10.1109/TVCG.2020.2981456. Epub 2021 Jul 29.
Human-in-the-loop topic modeling allows users to explore and steer the process to produce better quality topics that align with their needs. When integrated into visual analytic systems, many existing automated topic modeling algorithms are given interactive parameters to allow users to tune or adjust them. However, this has limitations when the algorithms cannot be easily adapted to changes, and it is difficult to realize interactivity closely supported by underlying algorithms. Instead, we emphasize the concept of tight integration, which advocates for the need to co-develop interactive algorithms and interactive visual analytic systems in parallel to allow flexibility and scalability. In this article, we describe design goals for efficiently and effectively executing the concept of tight integration among computation, visualization, and interaction for hierarchical topic modeling of text data. We propose computational base operations for interactive tasks to achieve the design goals. To instantiate our concept, we present ArchiText, a prototype system for interactive hierarchical topic modeling, which offers fast, flexible, and algorithmically valid analysis via tight integration. Utilizing interactive hierarchical topic modeling, our technique lets users generate, explore, and flexibly steer hierarchical topics to discover more informed topics and their document memberships.