Saleheen Nazir, Ullah Md Azim, Chakraborty Supriyo, Ones Deniz S, Srivastava Mani, Kumar Santosh
University of Memphis.
IBM T. J. Watson Research Center.
Conf Comput Commun Secur. 2021 Nov;2021:2807-2823. doi: 10.1145/3460120.3484799. Epub 2021 Nov 13.
Public releases of wrist-worn motion sensor data are growing. Such releases enable and accelerate research on new algorithms that passively track daily activities, which improves the health and wellness utility of smartwatches and activity trackers. However, when a sensitive attribute inference attack is combined with a linkage attack that re-identifies the same user across multiple datasets, undisclosed sensitive attributes can be revealed to unintended organizations, with potentially adverse consequences for unsuspecting data-contributing users. To guide both users and data-collecting researchers, we characterize the re-identification risks inherent in motion sensor data collected from wrist-worn devices in users' natural environments. For this purpose, we use an open-set formulation, train a deep learning architecture with a new loss function, and apply our model to a new dataset consisting of 10 weeks of daily sensor wear by 353 users. We find that re-identification risk increases with activity intensity. On average, the re-identification risk is 96% for a user who shares a full day of sensor data.
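To illustrate what an open-set formulation means for re-identification, here is a minimal sketch. It is not the model or loss function described in the abstract. It assumes each segment of sensor data has already been mapped to a fixed-length embedding vector, and it uses cosine similarity with a hypothetical rejection threshold of 0.9. The names `reidentify`, `gallery`, and `threshold` are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def reidentify(query, gallery, threshold=0.9):
    """Return the best-matching enrolled user id, or None.

    Open-set behaviour: if no enrolled embedding is similar enough to
    the query, the query is treated as coming from an unknown user
    instead of being forced onto the nearest enrolled user.
    """
    best_id, best_score = None, -1.0
    for user_id, embedding in gallery.items():
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_id, best_score = user_id, score
    return best_id if best_score >= threshold else None


# Toy embeddings (hypothetical values, not real sensor data).
gallery = {
    "user_a": [1.0, 0.0, 0.0],
    "user_b": [0.0, 1.0, 0.0],
}

known = reidentify([0.9, 0.1, 0.0], gallery)    # close to user_a
unknown = reidentify([0.0, 0.0, 1.0], gallery)  # unlike any enrolled user
```

In a closed-set setting the second query would be assigned to one of the enrolled users regardless of similarity. The threshold is what lets the matcher report that a user is absent from the linked dataset.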