Chagoyen Mónica, Poyatos Juan F
Computational Systems Biology Group (CNB-CSIC), Madrid E-28049, Spain.
Logic of Genomic Systems Laboratory (CNB-CSIC), Madrid E-28049, Spain.
PNAS Nexus. 2025 Jan 16;4(1):pgaf008. doi: 10.1093/pnasnexus/pgaf008. eCollection 2025 Jan.
While more data are becoming available on gene activity at different levels of biological organization, our understanding of the underlying biology remains incomplete. Here, we introduce a metabolic efficiency framework that considers highly expressed proteins (HEPs), their length, and their biosynthetic costs in terms of the amino acids (AAs) they contain to address the observed balance of expression costs in cells, tissues, and cancer transformation. Notably, the combined set of HEPs in either cells or tissues shows an abundance of large and costly proteins, yet tissues compensate for this with short HEPs composed of economical AAs, indicating a stronger tendency toward mitigating costs. We additionally observe that short proteins are prevalent HEPs across individual cells and tissues, whereas long ones are more specific. Furthermore, the precise proportions of the short, long, economical, and costly HEP classes indicate that particular cell types and tissues align more closely with the metabolic efficiency model, with some tissues behaving like their constituent cells. Finally, tumors typically increase the production of short and low-cost HEPs compared with matched normal tissues, while genes whose high expression levels decrease in tumors tend to be associated with high costs. Overall, the metabolic efficiency framework serves as a useful simplifying model for interpreting genome-wide expression data across scales.
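The four HEP classes mentioned in the abstract (short, long, economical, costly) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the top-fraction cutoff used to call a protein highly expressed, the use of medians as class boundaries, and above all the per-residue cost table, which holds arbitrary placeholder numbers rather than measured biosynthetic costs.

```python
from statistics import median

# PLACEHOLDER per-residue costs in arbitrary units (an assumption for this
# sketch, NOT measured biosynthetic costs). Unlisted residues get DEFAULT_COST.
TOY_AA_COST = {"G": 1.0, "A": 1.0, "S": 1.0, "W": 7.0, "F": 5.0, "Y": 5.0}
DEFAULT_COST = 3.0


def mean_aa_cost(seq):
    """Average placeholder cost per residue of a protein sequence."""
    return sum(TOY_AA_COST.get(aa, DEFAULT_COST) for aa in seq) / len(seq)


def classify_heps(proteins, expression, top_fraction=0.1):
    """Label the most highly expressed proteins by length and cost.

    proteins:     dict mapping protein name -> amino acid sequence
    expression:   dict mapping protein name -> expression level
    top_fraction: share of proteins treated as HEPs (hypothetical cutoff)
    """
    # Class boundaries: medians over ALL proteins (an assumed convention).
    length_cut = median(len(seq) for seq in proteins.values())
    cost_cut = median(mean_aa_cost(seq) for seq in proteins.values())

    # HEPs: the top fraction of proteins ranked by expression level.
    n_top = max(1, int(len(proteins) * top_fraction))
    ranked = sorted(proteins, key=lambda name: expression[name], reverse=True)

    labels = {}
    for name in ranked[:n_top]:
        seq = proteins[name]
        size = "short" if len(seq) <= length_cut else "long"
        cost = "economical" if mean_aa_cost(seq) <= cost_cut else "costly"
        labels[name] = (size, cost)
    return labels


# Toy input: four made-up sequences with made-up expression levels.
toy_proteins = {
    "p1": "GASGAS",
    "p2": "WFYWFYWFYWFY",
    "p3": "GAGAGAGAGAGAGA",
    "p4": "WFW",
}
toy_expression = {"p1": 100, "p2": 90, "p3": 5, "p4": 1}

hep_classes = classify_heps(toy_proteins, toy_expression, top_fraction=0.5)
print(hep_classes)
```

Counting how many HEPs fall into each class for a given cell type, tissue, or tumor sample would then give the kind of class proportions the abstract compares, although the actual thresholds and cost measures used in the study may differ from this sketch.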