Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO, United States of America.
PLoS Comput Biol. 2018 Sep 24;14(9):e1006256. doi: 10.1371/journal.pcbi.1006256. eCollection 2018 Sep.
Proteins with low-complexity domains continue to emerge as key players in both normal and pathological cellular processes. Although low-complexity domains are often grouped into a single class, individual low-complexity domains can differ substantially with respect to amino acid composition. These differences may strongly influence the physical properties, cellular regulation, and molecular functions of low-complexity domains. Therefore, we developed a bioinformatic approach to explore relationships between amino acid composition, protein metabolism, and protein function. We find that local compositional enrichment within protein sequences is associated with differences in translation efficiency, abundance, half-life, protein-protein interaction promiscuity, subcellular localization, and molecular functions of proteins on a proteome-wide scale. However, local enrichment of related amino acids is sometimes associated with opposite effects on protein regulation and function, highlighting the importance of distinguishing between different types of low-complexity domains. Furthermore, many of these effects are discernible at amino acid compositions below those required for classification as low-complexity or statistically-biased by traditional methods and in the absence of homopolymeric amino acid repeats, indicating that thresholds employed by classical methods may not reflect biologically relevant criteria. Application of our analyses to composition-driven processes, such as the formation of membraneless organelles, reveals distinct composition profiles even for closely related organelles. Collectively, these results provide a unique perspective and detailed insights into relationships between amino acid composition, protein metabolism, and protein functions.
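As an illustration of what "local compositional enrichment" can mean in practice, the sketch below scans a protein sequence with a sliding window and reports the highest fraction of a given amino acid found in any window. This is a minimal sketch and not the authors' published method: the function name `max_local_fraction`, the default window length of 20 residues, and the example sequences are all assumptions made for illustration only.

```python
def max_local_fraction(sequence: str, residue: str, window: int = 20) -> float:
    """Return the highest fraction of `residue` in any window of `window` residues.

    Hypothetical helper; the window length is an illustrative assumption,
    not a value taken from the paper.
    """
    seq = sequence.upper()
    residue = residue.upper()
    if not seq or window <= 0:
        return 0.0
    # Sequences shorter than the window are scored as a single window.
    window = min(window, len(seq))

    # Count the residue in the first window, then slide one position at a time,
    # adding the residue that enters and subtracting the one that leaves.
    count = seq[:window].count(residue)
    best = count
    for i in range(window, len(seq)):
        count += (seq[i] == residue) - (seq[i - window] == residue)
        best = max(best, count)
    return best / window


if __name__ == "__main__":
    # Made-up sequence with a glutamine-rich stretch in the middle.
    example = "MSTAAAAQQQQQQQQQQAAAAKLV"
    print(max_local_fraction(example, "Q", window=10))
```

A proteome-wide analysis of the kind described in the abstract would apply such a scan to every protein and every amino acid and then compare proteins above and below a chosen composition threshold. Those thresholds and comparisons are not specified in the abstract and are not reproduced here.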