IEEE Trans Neural Syst Rehabil Eng. 2019 Aug;27(8):1546-1555. doi: 10.1109/TNSRE.2019.2926965. Epub 2019 Jul 4.
Machine learning (ML) is revolutionizing research and industry. Many ML applications rely on large amounts of personal data for training and inference. Among the most intimate of these data sources is electroencephalogram (EEG) data, which is so rich in information that application developers can easily extract knowledge beyond the professed scope from unprotected EEG signals, including passwords, ATM PINs, and other intimate data. The challenge we address is how to perform meaningful ML on EEG data while protecting the privacy of users. To this end, we propose cryptographic protocols based on secure multiparty computation (SMC) that perform linear regression over EEG signals from many users in a fully privacy-preserving (PP) fashion, i.e., such that each individual's EEG signals are not revealed to anyone else. To illustrate the potential of our secure framework, we show that it allows estimating the drowsiness of drivers from their EEG signals as would be possible in the unencrypted case, and at a very reasonable computational cost. Our solution is the first application of commodity-based SMC to EEG data, as well as the largest documented experiment of secret sharing-based SMC in general, namely, with 15 players involved in all the computations.
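The idea behind secret sharing-based SMC can be illustrated with a minimal sketch of additive secret sharing, in which 15 players jointly compute a sum without any single player seeing another's input. This is not the protocol of the paper: the field modulus `PRIME`, the fixed-point factor `SCALE`, and all function names are assumptions chosen for illustration. The sketch covers only addition; the multiplications a secure linear regression needs (and the commodity-based setup the abstract mentions) are not shown.

```python
import secrets

# Illustrative parameters (assumptions, not taken from the paper).
PRIME = 2**61 - 1   # modulus of the finite field holding the shares
SCALE = 10**6       # fixed-point factor for encoding real values


def encode(x):
    """Map a real number to a field element using fixed-point encoding."""
    return round(x * SCALE) % PRIME


def decode(v):
    """Map a field element back to a real number (upper half = negative)."""
    if v > PRIME // 2:
        v -= PRIME
    return v / SCALE


def share(secret, n_parties):
    """Split a field element into n additive shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine additive shares into the original field element."""
    return sum(shares) % PRIME


def secure_sum(private_values, n_parties):
    """Sum the private inputs; only the final total is ever reconstructed."""
    received = [[] for _ in range(n_parties)]
    for x in private_values:
        # The owner of x sends one share to each party.
        for j, s in enumerate(share(encode(x), n_parties)):
            received[j].append(s)
    # Each party adds the shares it holds locally (no communication needed).
    partial_sums = [sum(r) % PRIME for r in received]
    # Only the partial sums are combined, revealing just the total.
    return decode(reconstruct(partial_sums))


if __name__ == "__main__":
    # Hypothetical per-user values, one per player.
    inputs = [0.5, -1.25, 2.0, 0.75, 1.5, -0.5, 3.0, 0.25,
              1.0, -2.0, 0.5, 1.25, 2.5, -0.75, 1.0]
    print(secure_sum(inputs, n_parties=15))
```

Each individual share is uniformly random, so a player learns nothing from the shares it receives; only the combination of all 15 partial sums yields the total.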