Sen Arpita, Hsieh Wen-Chieh, Aguilar R Claudio
Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA.
Current address, Dept. of Molecular & Cell Biology, University of California, Berkeley.
Proc IEEE Inst Electr Electron Eng. 2017 Feb;105(2):385-393. doi: 10.1109/JPROC.2016.2613076. Epub 2016 Dec 1.
The presence of abnormally expanded glutamine (Q) repeats within specific proteins (e.g., huntingtin) is the well-established cause of several neurodegenerative diseases, including Huntington disease and spinocerebellar ataxias. However, the impact of "expanded Q" stretches on protein function is not well understood, mostly due to a lack of knowledge about the physiological role of Q repeats and the mechanism by which these repeats achieve functional specificity. Indeed, it is intriguing that regions with such low complexity (low information content) can display exquisite functional specificity, which prompts the question: where is this information stored? Applying biochemical/structural constraints and statistical analysis of protein composition, we identified Q-rich regions present in coiled coils of yeast transcription factors and endocytic proteins. Our analysis indicated the existence of non-Q amino acids differentially enriched in or excluded from the Q-rich regions of one protein group versus the other. Importantly, when the non-Q amino acids from an endocytic protein were replaced with the ones enriched in the Q-rich regions of transcription factors, the resulting protein was unable to localize to the plasma membrane and was instead found in the nucleus. These results indicate that while Q repeats can efficiently engage in binding, the non-Q amino acids provide essential specificity information. We speculate that coupling low-complexity regions with information-intensive determinants might be a strategy used in many protein systems involved in different biological processes.