

Compression for Quadratic Similarity Queries.

Authors

Ingber Amir, Courtade Thomas, Weissman Tsachy

Affiliation

Department of Electrical Engineering, Stanford University, Stanford, CA 94305 USA.

Publication

IEEE Trans Inf Theory. 2015 May;61(5):2729-2747. doi: 10.1109/tit.2015.2402972. Epub 2015 Feb 11.

DOI:10.1109/tit.2015.2402972
PMID:29375151
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5786438/
Abstract

The problem of performing similarity queries on compressed data is considered. We focus on the quadratic similarity measure, and study the fundamental tradeoff between compression rate, sequence length, and reliability of queries performed on the compressed data. For a Gaussian source, we show that the queries can be answered reliably if and only if the compression rate exceeds a given threshold, the identification rate, which we explicitly characterize. Moreover, when compression is performed at a rate greater than the identification rate, responses to queries on the compressed data can be made exponentially reliable. We give a complete characterization of this exponent, which is analogous to the error and excess-distortion exponents in channel and source coding, respectively. For a general source, we prove that, as with classical compression, the Gaussian source requires the largest compression rate among sources with a given variance. Moreover, a robust scheme is described that attains this maximal rate for any source distribution.
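The query model described in the abstract — deciding from a compressed representation alone whether an unseen sequence is within a quadratic distance of the stored source sequence — can be illustrated with a toy sketch. The uniform scalar quantizer below is purely illustrative (the paper analyzes optimal schemes and their identification rate, not this compressor); all names and parameter values are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 512   # sequence length
R = 3     # compression rate, bits per sample (so 2**R quantizer levels)
D = 0.5   # similarity threshold: the query asks whether ||y - x||^2 <= n*D

# Source sequence: i.i.d. standard Gaussian, as in the paper's Gaussian case.
x = rng.standard_normal(n)

# Toy compressor: uniform scalar quantizer over [-4, 4] with 2**R levels.
levels = 2 ** R
edges = np.linspace(-4.0, 4.0, levels + 1)
centers = (edges[:-1] + edges[1:]) / 2
idx = np.clip(np.digitize(x, edges) - 1, 0, levels - 1)
x_hat = centers[idx]  # the compressed representation the query server stores

def query(y, x_hat, n, D):
    """Answer 'is y similar to x?' using only the compressed x_hat."""
    return bool(np.sum((y - x_hat) ** 2) <= n * D)

# A query sequence close to x should be declared similar...
y_close = x + 0.1 * rng.standard_normal(n)
# ...while an independent Gaussian sequence should not.
y_far = rng.standard_normal(n)

print(query(y_close, x_hat, n, D), query(y_far, x_hat, n, D))
# prints: True False
```

At this rate the per-sample quantization error (about 1/12 for unit step width) plus the query noise stays well below D, while an independent sequence sits near per-sample distance 2; the paper's result is that reliable answers of this kind are possible precisely when R exceeds the identification rate, with exponentially small error probability above it.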


Figures
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d749/5786438/4880dd0ad7c4/nihms910307f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d749/5786438/503b6531b5a2/nihms910307f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d749/5786438/6ea70a924f4f/nihms910307f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d749/5786438/40ac6a01fe9a/nihms910307f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d749/5786438/e7f658d92909/nihms910307f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d749/5786438/2704fb4c0b39/nihms910307f6.jpg

Similar Articles

1
Compression for Quadratic Similarity Queries.
IEEE Trans Inf Theory. 2015 May;61(5):2729-2747. doi: 10.1109/tit.2015.2402972. Epub 2015 Feb 11.
2
Compression for Similarity Identification: Computing the Error Exponent.
Proc Data Compress Conf. 2015 Apr;2015:413-422. doi: 10.1109/DCC.2015.75. Epub 2015 Jul 6.
3
Compression for Quadratic Similarity Queries: Finite Blocklength and Practical Schemes.
IEEE Trans Inf Theory. 2016 May;62(5):2737-2747. doi: 10.1109/TIT.2016.2535172. Epub 2016 Feb 26.
4
How Does ChatGPT Use Source Information Compared With Google? A Text Network Analysis of Online Health Information.
Clin Orthop Relat Res. 2024 Apr 1;482(4):578-588. doi: 10.1097/CORR.0000000000002995. Epub 2024 Mar 1.
5
Exponential Strong Converse for Source Coding with Side Information at the Decoder.
Entropy (Basel). 2018 May 8;20(5):352. doi: 10.3390/e20050352.
6
Asymptotic Rate-Distortion Analysis of Symmetric Remote Gaussian Source Coding: Centralized Encoding vs. Distributed Encoding.
Entropy (Basel). 2019 Feb 23;21(2):213. doi: 10.3390/e21020213.
7
Analysis on Optimal Error Exponents of Binary Classification for Source with Multiple Subclasses.
Entropy (Basel). 2022 Apr 30;24(5):635. doi: 10.3390/e24050635.
8
Network Compression: Worst Case Analysis.
IEEE Trans Inf Theory. 2015 Jul;61(7):3980-3995. doi: 10.1109/tit.2015.2434829. Epub 2015 Jun 12.
9
Trade-offs between Error Exponents and Excess-Rate Exponents of Typical Slepian-Wolf Codes.
Entropy (Basel). 2021 Feb 24;23(3):265. doi: 10.3390/e23030265.
10
A joint source-channel distortion model for JPEG compressed images.
IEEE Trans Image Process. 2006 Jun;15(6):1349-64. doi: 10.1109/tip.2006.871118.

Cited By

1
Compression for Quadratic Similarity Queries: Finite Blocklength and Practical Schemes.
IEEE Trans Inf Theory. 2016 May;62(5):2737-2747. doi: 10.1109/TIT.2016.2535172. Epub 2016 Feb 26.

References

1
Achievable Rates for Pattern Recognition.
IEEE Trans Inf Theory. 2008 Jan;54(1):299-320. doi: 10.1109/tit.2007.911296. Epub 2008 Jan 4.
2
Compression for Quadratic Similarity Queries: Finite Blocklength and Practical Schemes.
IEEE Trans Inf Theory. 2016 May;62(5):2737-2747. doi: 10.1109/TIT.2016.2535172. Epub 2016 Feb 26.