IEEE Trans Vis Comput Graph. 2020 Jan;26(1):558-568. doi: 10.1109/TVCG.2019.2934614. Epub 2019 Aug 20.
Users across many domains increasingly rely on real-time social media data to gain rapid situational awareness. However, the data stream is noisy, which makes it difficult to identify semantically relevant information, and the difficulty is compounded because each end user's definition of relevance changes from event to event. Most existing methods for short-text relevance classification do not incorporate users' knowledge into the classification process, and those that do incorporate interactive user feedback focus on historical datasets. As a result, classifiers cannot be interactively retrained for specific events or user-dependent needs in real time. This limits real-time situational awareness: misclassified streaming data cannot be corrected immediately, so important incoming data may be misclassified as well. We present a novel interactive learning framework in which the user iteratively corrects the relevance labels of tweets in real time, training the classification model on the fly for immediate predictive improvements. We computationally evaluate our classification model, which is adapted to learn at interactive rates, and our results show that the approach outperforms state-of-the-art machine learning models. In addition, we integrate our framework with the extended Social Media Analytics and Reporting Toolkit (SMART) 2.0 system, which makes the interactive learning framework available within a visual analytics system tailored for real-time situational awareness. To demonstrate the framework's effectiveness, we report domain expert feedback from first responders who used the extended SMART 2.0 system.
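The correct-and-retrain loop described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the classification model, so this example swaps in a plain online logistic regression over bag-of-words tokens. The class name `OnlineRelevanceClassifier`, the method `correct`, the learning rate, and the example tweets are all hypothetical and are not taken from the paper or from SMART 2.0.

```python
import math


class OnlineRelevanceClassifier:
    """Hypothetical stand-in model: online logistic regression over word tokens.

    This is NOT the model from the paper; it only illustrates how a single
    user correction can update a classifier immediately.
    """

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.weights = {}  # token -> weight, created lazily
        self.bias = 0.0

    def _tokens(self, text):
        # Binary bag-of-words: each distinct lowercase token counts once.
        return set(text.lower().split())

    def predict_proba(self, text):
        z = self.bias + sum(self.weights.get(t, 0.0) for t in self._tokens(text))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def predict(self, text):
        # True means "relevant".
        return self.predict_proba(text) >= 0.5

    def correct(self, text, relevant, steps=5):
        """Apply one user-supplied label with a few gradient steps."""
        target = 1.0 if relevant else 0.0
        tokens = self._tokens(text)
        for _ in range(steps):
            error = target - self.predict_proba(text)
            self.bias += self.learning_rate * error
            for t in tokens:
                self.weights[t] = self.weights.get(t, 0.0) + self.learning_rate * error


# Simulated session: the user corrects two tweets, then a new tweet arrives.
clf = OnlineRelevanceClassifier()
clf.correct("bridge flooded on main street", relevant=True)
clf.correct("flooded with emails today lol", relevant=False)
print(clf.predict("main street bridge is flooded"))
```

Each call to `correct` changes the model before the next tweet is scored, which is the property the abstract emphasizes. A production system would need a stronger text representation and a way to balance new corrections against earlier training data.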