IEEE Trans Pattern Anal Mach Intell. 2016 Aug;38(8):1707-20. doi: 10.1109/TPAMI.2015.2496269. Epub 2015 Oct 30.
Studying free-standing conversational groups (FCGs) in unstructured social settings (e.g., a cocktail party) is gratifying because of the wealth of information available at the group level (mining social networks) and the individual level (recognizing native behavioral and personality traits). However, analyzing social scenes involving FCGs is also highly challenging: crowding and extreme occlusions make it difficult to extract behavioral cues such as target locations, speaking activity, and head/body pose. To this end, we propose SALSA, a novel dataset facilitating multimodal and Synergetic sociAL Scene Analysis, and make two main contributions to research on automated social interaction analysis: (1) SALSA records social interactions among 18 participants in a natural, indoor environment for over 60 minutes, under poster presentation and cocktail party contexts that present difficulties in the form of low-resolution images, lighting variations, numerous occlusions, reverberations, and interfering sound sources; (2) to alleviate these problems, we facilitate multimodal analysis by recording the social interplay using four static surveillance cameras and sociometric badges worn by each participant, each comprising a microphone, an accelerometer, and Bluetooth and infrared sensors. In addition to the raw data, we provide annotations of individuals' personality as well as their position, head and body orientation, and F-formation information over the entire event duration. Through extensive experiments with state-of-the-art approaches, we show (a) the limitations of current methods and (b) how the recorded multiple cues synergetically aid automatic analysis of social interactions. SALSA is available at http://tev.fbk.eu/salsa.