A scaling law to model the effectiveness of identification techniques.

Authors

Rocher Luc, Hendrickx Julien M, Montjoye Yves-Alexandre de

Affiliations

Oxford Internet Institute, University of Oxford, Oxford, UK.

Information and Communication Technologies, Electronics and Applied Mathematics (ICTEAM), Université catholique de Louvain, Louvain-la-Neuve, Belgium.

Publication information

Nat Commun. 2025 Jan 9;16(1):347. doi: 10.1038/s41467-024-55296-6.

Abstract

AI techniques are increasingly being used to identify individuals both offline and online. However, quantifying their effectiveness at scale and, by extension, the risks they pose remains a significant challenge. Here, we propose a two-parameter Bayesian model for exact matching techniques and derive an analytical expression for correctness (κ), the fraction of people accurately identified in a population. We then generalize the model to forecast how κ scales from small-scale experiments to the real world, for exact, sparse, and machine learning-based robust identification techniques. Despite having only two degrees of freedom, our method closely fits 476 correctness curves and strongly outperforms curve-fitting methods and entropy-based rules of thumb. Our work provides a principled framework for forecasting the privacy risks posed by identification techniques, while also supporting independent accountability efforts for AI-based biometric systems.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/95a1/11718298/5eb997146861/41467_2024_55296_Fig1_HTML.jpg
