Proteins of Escherichia coli come in sizes that are multiples of 14 kDa: domain concepts and evolutionary implications.

Author information

Savageau M A

Publication information

Proc Natl Acad Sci U S A. 1986 Mar;83(5):1198-202. doi: 10.1073/pnas.83.5.1198.

DOI:10.1073/pnas.83.5.1198

PMID:3513170

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC323042/

Abstract

Initial attempts to correlate the distribution of gene density (number of gene loci per unit length on the linkage map) with the distribution of lengths of coding sequences have led to the observation that 46% of approximately 1000 sampled proteins in Escherichia coli have molecular masses of n × 14,000 ± 2,500 daltons (n = 1, 2, ...). This clustering around multiples of 14,000 contrasts with the 36% one would expect in these ranges if the sizes were uniformly distributed. The entire distribution is well fit by a sum of normal or lognormal distributions located at multiples of 14,000, which suggests that the percentage of E. coli proteins governed by the underlying sizing mechanism is much greater than 50%. Clustering of protein molecular sizes around multiples of a unit size also is suggested by the distribution of well-characterized HeLa cell proteins. The distribution of gene lengths for E. coli suggests regular clustering, which implies that the clustering of protein molecular masses is not an artifact of the molecular mass measurement by gel electrophoresis. These observations suggest the existence of a fundamental structural unit. The rather uniform size of this structural unit (without any apparent sequence homology) suggests that a general principle such as geometrical or physical optimization at the DNA or protein level is responsible. This suggestion is discussed in relation to experimental evidence for the domain structure of proteins and to existing hypotheses that attempt to account for these domains. Microevolution would appear to be accommodated by incremental changes within this fundamental unit, whereas macroevolution would appear to involve "quantum" changes to the next stable size of protein.
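
The 36% baseline quoted in the abstract can be reproduced with a short calculation: under a uniform size distribution, each 14,000-dalton period contains a window 5,000 daltons wide (±2,500), so the expected fraction is 5,000/14,000, or about 36%. The following is a minimal sketch in Python, not the author's method. The function names are hypothetical, and the list of masses in the usage lines is made-up illustration data rather than values from the paper.

```python
def expected_uniform_fraction(unit=14_000.0, half_width=2_500.0):
    """Fraction of uniformly distributed sizes that fall within
    +/- half_width of some multiple of unit (window width / period)."""
    return (2 * half_width) / unit


def fraction_near_multiples(masses, unit=14_000.0, half_width=2_500.0):
    """Fraction of the given masses (in daltons) lying within
    half_width of n * unit for some integer n >= 1."""
    hits = 0
    for mass in masses:
        # Nearest multiple of the unit size, never below n = 1.
        n = max(1, round(mass / unit))
        if abs(mass - n * unit) <= half_width:
            hits += 1
    return hits / len(masses)


# Baseline under the uniform assumption: 5000 / 14000.
baseline = expected_uniform_fraction()

# Hypothetical masses, for illustration only (not data from the paper).
example_masses = [14_000, 28_000, 21_000, 16_500, 16_501]
observed = fraction_near_multiples(example_masses)
```

Applied to a real catalogue of protein masses, comparing `observed` with `baseline` mirrors the 46% versus 36% contrast reported in the abstract. The mixture-of-distributions fit mentioned there is a separate analysis and is not sketched here.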

