Yun Jian, Lu Yusheng, Liu Xinyang, Guan Jingdan
Dalian Minzu University, College of Computer Science and Engineering, Dalian, Liaoning, China.
PeerJ Comput Sci. 2024 Sep 9;10:e2268. doi: 10.7717/peerj-cs.2268. eCollection 2024.
The growing use of artificial intelligence generated content (AIGC) across large user populations has heightened the risk of private data leaks. Auditing and regulation remain difficult, which compounds the risk of leaking model parameters and user data. Blockchain technology, with its decentralized consensus mechanism and tamper-resistant properties, is emerging as a suitable tool for documenting, auditing, and analyzing the behavior of all stakeholders in machine learning as a service (MLaaS). This study focuses on biometric recognition systems and their privacy and security concerns. We ran experiments on six deep neural networks, using a dataset quality metric based on the query output space to quantify the value of transfer datasets. The analysis revealed the impact of imbalanced datasets on training accuracy, which strengthens the system's capacity to detect model data theft. We also designed and implemented a novel Bio-Rollup scheme that integrates a certificate authority, blockchain layer-two scaling, and zero-knowledge proofs. The scheme supports lightweight auditing through Merkle proofs, improving efficiency while minimizing blockchain storage requirements. Compared with the baseline approach, Bio-Rollup restores the integrity of the biometric system and simplifies deployment. It prevents unauthorized use through certificate authorization and zero-knowledge proofs, thereby safeguarding user privacy and offering a passive defense against model stealing attacks.
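To illustrate why Merkle proofs keep auditing lightweight, the following is a minimal sketch of a generic Merkle tree over a batch of audit records. It is not the Bio-Rollup implementation: the record contents, the function names (`merkle_root`, `merkle_proof`, `verify`), the choice of SHA-256, the leaf/node prefix bytes, and the rule of duplicating the last node on odd-sized levels are all assumptions made for illustration. The idea it shows is that only the root needs to be stored on chain, while an auditor checks a single record with a proof whose length grows logarithmically with the batch size.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _leaf(record: bytes) -> bytes:
    # Prefix bytes separate leaf hashes from inner-node hashes (assumed convention).
    return _h(b"\x00" + record)


def _node(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def _pad(level):
    # Assumed rule: duplicate the last node when a level has an odd count.
    return level + [level[-1]] if len(level) % 2 == 1 else level


def _levels(records):
    if not records:
        raise ValueError("at least one record is required")
    level = [_leaf(r) for r in records]
    levels = [level]
    while len(level) > 1:
        padded = _pad(level)
        level = [_node(padded[i], padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def merkle_root(records):
    """Root hash of the batch; the only value that would need on-chain storage."""
    return _levels(records)[-1][0]


def merkle_proof(records, index):
    """Sibling hashes from the leaf at `index` up to the root."""
    proof = []
    for level in _levels(records)[:-1]:
        padded = _pad(level)
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((padded[sibling], sibling < index))
        index //= 2
    return proof


def verify(record, proof, root):
    """Recompute the path to the root using only the record and its proof."""
    node = _leaf(record)
    for sibling, sibling_is_left in proof:
        node = _node(sibling, node) if sibling_is_left else _node(node, sibling)
    return node == root


# Hypothetical audit records for one batch of MLaaS queries.
records = [f"query-{i}".encode() for i in range(5)]
root = merkle_root(records)
proof = merkle_proof(records, 4)
print(verify(records[4], proof, root))
print(verify(b"tampered", proof, root))
```

In this sketch a batch of five records yields a three-element proof, so the auditor never needs the other four records. A tampered record fails verification because the recomputed root no longer matches the stored one.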