Information in morphological characters.

Author information

Yu Congyu, Jiangzuo Qigao, Tschopp Emanuel, Wang Haibing, Norell Mark

Affiliations

Division of Paleontology American Museum of Natural History New York NY USA.

Department of Earth and Environmental Sciences Columbia University New York NY USA.

Publication information

Ecol Evol. 2021 Aug 4;11(17):11689-11699. doi: 10.1002/ece3.7874. eCollection 2021 Sep.

Abstract

The construction of morphological character matrices is central to paleontological systematic study, which extracts paleontological information from fossils. Although the word "information" is repeatedly mentioned in a wide array of paleontological systematic studies, its meaning has rarely been clarified or specifically defined. It is important, however, to establish a standard for measuring paleontological information because fossils are rarely complete, which makes homologous and homoplastic structures difficult to recognize. Here, based on information theory, we show the deep connections between paleontological systematic study and communication system engineering. Information is defined as the decrease of uncertainty, and it is the information in morphological characters that allows operational taxonomic units (OTUs) to be distinguished and evolutionary history to be reconstructed. We propose that concepts in communication system engineering, such as source coding and channel coding, correspond respectively to the construction of diagnostic features and of entire character matrices in paleontological studies. The two coding strategies should be distinguished, as in typical communication system engineering, because they serve different purposes. With character matrices from six different vertebrate groups, we analyzed their information properties, including source entropy, mutual information, and channel capacity. Estimates of channel capacity show that all matrices are saturated with characters for transmitting paleontological information, indicating that, because of noise, oversampling characters not only increases the burden of character scoring but may also decrease the quality of the matrices. We further test the use of information entropy, which measures how informative a variable is, as a character-weighting criterion in parsimony-based systematic studies. The results show high consistency with existing knowledge, with both good resolution and interpretability.
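To make the notion of information entropy for a character concrete, the following is a minimal Python sketch that computes the Shannon entropy of each column in a small character matrix. The toy matrix, the taxon names, the treatment of `?` and `-` as missing data, and the function names are all assumptions made for illustration. They are not taken from the paper, and this is not the authors' weighting procedure.

```python
import math
from collections import Counter

# Symbols treated as missing or inapplicable data (an assumption of this sketch).
MISSING = {"?", "-"}


def shannon_entropy(states):
    """Shannon entropy, in bits, of the observed states of one character."""
    observed = [s for s in states if s not in MISSING]
    if not observed:
        return 0.0
    n = len(observed)
    counts = Counter(observed)
    # H = sum over states of -p * log2(p)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def character_entropies(matrix):
    """Entropy of every character (column) in an OTU-by-character matrix."""
    rows = list(matrix.values())
    n_chars = len(rows[0])
    return [shannon_entropy([row[i] for row in rows]) for i in range(n_chars)]


# Hypothetical toy matrix: each row is an OTU, each column a character.
toy_matrix = {
    "taxon_A": "0010",
    "taxon_B": "0111",
    "taxon_C": "01?0",
    "taxon_D": "0101",
}

if __name__ == "__main__":
    for index, h in enumerate(character_entropies(toy_matrix)):
        print(f"character {index}: H = {h:.3f} bits")
```

In this toy matrix the first character has the same state in every taxon, so its entropy is zero and it cannot help distinguish OTUs, while the last character splits the taxa evenly and reaches the one-bit maximum for a binary character. How such entropy values would be converted into parsimony weights is left open here.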

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa98/8427622/44539ba855cc/ECE3-11-11689-g002.jpg
