Nemzer Louis R
Nova Southeastern University, Chemistry and Physics, 3301 College Ave., Davie, FL 33314, United States.
J Theor Biol. 2017 Feb 21;415:158-170. doi: 10.1016/j.jtbi.2016.12.010. Epub 2016 Dec 20.
The Shannon entropy measures the expected information value of messages. As with thermodynamic entropy, the Shannon entropy is only defined within a system that identifies at the outset the collections of possible messages, analogous to microstates, that will be considered indistinguishable macrostates. This fundamental insight is applied here for the first time to amino acid alphabets, which group the twenty common amino acids into families based on chemical and physical similarities. To evaluate these schemas objectively, a novel quantitative method is introduced based on the inherent redundancy in the canonical genetic code. Each alphabet is taken as a separate system that partitions the 64 possible RNA codons, the microstates, into families, the macrostates. By calculating the normalized mutual information, which measures the reduction in Shannon entropy conveyed by single-nucleotide messages, groupings that best leverage this aspect of fault tolerance in the code are identified. The relative importance of properties related to protein folding, such as hydropathy and size, and to function, including side-chain acidity, can also be estimated. This approach allows the quantification of the average information value of nucleotide positions, which can shed light on the coevolution of the canonical genetic code with the tRNA-protein translation mechanism.
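The calculation described above can be illustrated with a minimal Python sketch. It is not the paper's implementation and rests on several assumptions: all 64 codons are weighted equally, stop codons form their own family, and the mutual information between one nucleotide position and the family label is normalized by the family entropy. The names `normalized_mi` and `HYDROPATHY_EXAMPLE`, and the two-family grouping itself, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product
from math import log2

# Canonical genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(n / total * log2(n / total) for n in counts if n)


def normalized_mi(position, alphabet=None):
    """Fraction of family entropy removed by knowing one nucleotide.

    position: 0, 1, or 2 (index within the codon).
    alphabet: dict mapping amino acid letter -> family label; letters not
              listed (including '*' for stop) keep their own label.
    Assumption: uniform weights over the 64 codons, normalized by H(family).
    """
    alphabet = alphabet or {}
    families = {codon: alphabet.get(aa, aa) for codon, aa in CODON_TABLE.items()}
    h_family = entropy(Counter(families.values()).values())
    if h_family == 0:
        return 0.0  # a single family carries no information to recover
    # Conditional entropy H(family | nucleotide at the chosen position)
    h_conditional = 0.0
    for base in BASES:
        subset = [fam for codon, fam in families.items() if codon[position] == base]
        h_conditional += len(subset) / 64 * entropy(Counter(subset).values())
    return (h_family - h_conditional) / h_family


# Hypothetical two-family grouping, for illustration only
HYDROPATHY_EXAMPLE = {aa: "hydrophobic" for aa in "AVLIMFWC"}
HYDROPATHY_EXAMPLE.update({aa: "polar" for aa in "STYNQDEKRHGP"})

if __name__ == "__main__":
    for pos in range(3):
        print(pos + 1, normalized_mi(pos), normalized_mi(pos, HYDROPATHY_EXAMPLE))
```

Passing no alphabet scores the full twenty-amino-acid alphabet (plus stop), while passing a coarser grouping shows how much of that grouping's entropy each nucleotide position accounts for under the stated assumptions.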