Local Compositional Complexity: How to Detect a Human-Readable Message.

Author

Louis Mahon

Affiliation

School of Informatics, Edinburgh University, Edinburgh EH8 9YL, UK.

Publication

Entropy (Basel). 2025 Mar 25;27(4):339. doi: 10.3390/e27040339.

DOI: 10.3390/e27040339
PMID: 40282574
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12025590/

Abstract

Data complexity is an important concept in the natural sciences and related areas, but lacks a rigorous and computable definition. This paper focusses on a particular sense of complexity that is high if the data is structured in a way that could serve to communicate a message. In this sense, human speech, written language, drawings, diagrams and photographs are high complexity, whereas data that is close to uniform throughout or populated by random values is low complexity. I describe a general framework for measuring data complexity based on dividing the shortest description of the data into a structured and an unstructured portion, and taking the size of the former as the complexity score. I outline an application of this framework in statistical mechanics that may allow a more objective characterisation of the macrostate and entropy of a physical system. Then, I derive a more precise and computable definition geared towards human communication, by proposing local compositionality as an appropriate specific structure. Experimental evaluation shows that this method can distinguish meaningful signals from noise or repetitive signals in auditory, visual and text domains, and could potentially help determine whether an extra-terrestrial signal contained a message.
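
The framework in the abstract, which splits the shortest description of the data into a structured and an unstructured portion and scores only the former, can be illustrated with a toy two-part code. The sketch below is not the paper's local compositional complexity measure. It assumes, purely for illustration, that the structured portion is a fitted order-k Markov model over characters (costed at half of log2(n) bits per free parameter, a common minimum-description-length approximation) and that the unstructured portion is the number of bits needed to encode the data given that model. The names `two_part_code` and `complexity_score` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def two_part_code(seq: str, order: int) -> tuple[float, float]:
    """Return (model_bits, residual_bits) for an order-`order` Markov model.

    Assumption for illustration only: the model is the "structured" portion
    and the residual is the "unstructured" portion of the description.
    """
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    alphabet_size = len(set(seq))

    # Count which symbol follows each context of length `order`.
    counts = defaultdict(Counter)
    for i in range(order, n):
        counts[seq[i - order:i]][seq[i]] += 1

    # Unstructured portion: bits to encode the data given the fitted model.
    residual_bits = 0.0
    for followers in counts.values():
        total = sum(followers.values())
        for count in followers.values():
            residual_bits -= count * math.log2(count / total)
    # The first `order` symbols have no full context, so send them verbatim.
    residual_bits += order * math.log2(alphabet_size)

    # Structured portion: (alphabet_size - 1) free parameters per observed
    # context, each costed at 0.5 * log2(n) bits.
    model_bits = len(counts) * (alphabet_size - 1) * 0.5 * math.log2(n)
    return model_bits, residual_bits


def complexity_score(seq: str, max_order: int = 3) -> float:
    """Size of the structured portion under the shortest two-part code."""
    candidates = [two_part_code(seq, k) for k in range(max_order + 1)]
    model_bits, _ = min(candidates, key=lambda pair: pair[0] + pair[1])
    return model_bits


print(complexity_score("a" * 400))   # uniform data: nothing to model
print(complexity_score("ab" * 200))  # strict alternation: a small model suffices
```

Under these assumptions the constant string scores 0 bits, and the alternating string scores log2(400), about 8.64 bits, because an order-1 model with two contexts gives the shortest total description. The toy only shows the bookkeeping of the split; a measure in the spirit of the abstract would need a model class suited to the data, for which the abstract proposes local compositionality.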

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2416/12025590/d7bbad7f708f/entropy-27-00339-g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2416/12025590/88db61f6fc50/entropy-27-00339-g002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2416/12025590/ea00417638ab/entropy-27-00339-g003.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2416/12025590/2d815efd3cd8/entropy-27-00339-g004.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2416/12025590/034084611c46/entropy-27-00339-g005.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2416/12025590/30192a439538/entropy-27-00339-g006.jpg
Figure 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2416/12025590/efcb28762642/entropy-27-00339-g007.jpg
Figure 8: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2416/12025590/29921bcae116/entropy-27-00339-g008.jpg
