Fu Dongqi, Bao Wenxuan, Maciejewski Ross, Tong Hanghang, He Jingrui
University of Illinois Urbana-Champaign.
Arizona State University.
SIGKDD Explor. 2023 Jul 5;25(1):54-72. doi: 10.1145/3606274.3606280.
In graph machine learning, data collection, sharing, and analysis often involve multiple parties, each of which may require a different level of data security and privacy. Preserving privacy is therefore essential to protecting sensitive information. In the era of big data, the relationships among data entities have become unprecedentedly complex, and more applications rely on advanced data structures (i.e., graphs) that can represent network structure together with the relevant attribute information. To date, many graph-based AI models (e.g., graph neural networks) have been proposed for tasks in various domains, such as computer vision and natural language processing. In this paper, we review privacy-preserving techniques for graph machine learning. We systematically review related work from the data aspect to the computational aspect. We first review methods for generating privacy-preserving graph data. We then describe methods for transmitting privacy-preserved information (e.g., graph model parameters) to realize optimization-based computation when sharing data among multiple parties is risky or impossible. In addition to discussing the relevant theoretical methodology and software tools, we discuss current challenges and highlight several possible future research opportunities for privacy-preserving graph machine learning. Finally, we envision a unified and comprehensive secure graph machine learning system.