Department of Biochemistry, UT Southwestern Medical Center, Dallas, Texas 75390-9152, USA
Genes Dev. 2024 Apr 17;38(5-6):205-212. doi: 10.1101/gad.351465.123.
This perspective begins with a speculative consideration of the properties of the earliest proteins to appear during evolution. What did these primitive proteins look like, and how were they of benefit to early forms of life? I proceed to hypothesize that primitive proteins have been preserved through evolution and now serve diverse functions important to the dynamics of cell morphology and biological regulation. The primitive nature of these modern proteins is easy to spot. They are composed of a limited subset of the 20 amino acids used by traditionally evolved proteins and thus are of low sequence complexity. This chemical simplicity limits protein domains of low sequence complexity to forming only a crude and labile type of protein structure currently hidden from the computational powers of machine learning. I conclude by hypothesizing that this structural weakness represents the underlying virtue of proteins that, at least for the moment, constitute the dark matter of the proteome.