Kusuma School of Biological Sciences, Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India.
Supercomputing Facility for Bioinformatics & Computational Biology, Indian Institute of Technology Delhi (IIT Delhi), New Delhi, India.
J Biomol Struct Dyn. 2021 May;39(8):2885-2893. doi: 10.1080/07391102.2020.1756410. Epub 2020 Apr 27.
Intrinsically disordered proteins are now widely accepted to play crucial roles in biological functions. Identification of signatures of intrinsic disorder is one of the key steps towards building a proper repertoire for their occurrence in proteomes. In this work, systematic computational synthesis of a library of all possible (3368400) dipeptides, tripeptides, tetrapeptides and pentapeptides using the natural 20 amino acids allowed us to identify 36 unique tetrapeptides present exclusively in intrinsically disordered proteins and absent in the complete primary sequence space of naturally occurring structured proteins. Further, out of more than 530000 known naturally occurring primary sequences without any structural information, 1349 sequences contain the above identified unique signatures of intrinsic disorder. These sequences, having cellular functions varying from housekeeping to metabolic to transport, more than double the number of the currently known intrinsically disordered proteins. On similar lines, we report that 26577 pentapeptide signatures exclusive to intrinsically disordered proteins, and absent in naturally occurring structured proteins, identify ∼50% of more than half-a-million curated protein sequences without structural information to be intrinsically disordered. The results reported are a major leap forward in exploring functional manifestations of intrinsically disordered proteins.
Communicated by Ramaswamy H. Sarma.
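The library size and the exclusivity screen described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`library_size`, `kmers`, `exclusive_signatures`, `flag_sequences`) and the toy sequences are hypothetical, and it assumes that a "signature" is simply a contiguous peptide of fixed length found in at least one disordered sequence and in no structured sequence.

```python
# Illustrative sketch only; names and toy data are assumptions, not from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 natural amino acids


def library_size(lengths=range(2, 6)):
    """Count all possible peptides of the given lengths (di- to pentapeptides)."""
    return sum(len(AMINO_ACIDS) ** k for k in lengths)


def kmers(sequence, k):
    """Return the set of contiguous peptides of length k in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def exclusive_signatures(disordered, structured, k):
    """Peptides of length k seen in disordered sequences but in no structured one."""
    seen_disordered = set()
    for seq in disordered:
        seen_disordered |= kmers(seq, k)
    seen_structured = set()
    for seq in structured:
        seen_structured |= kmers(seq, k)
    return seen_disordered - seen_structured


def flag_sequences(unannotated, signatures, k):
    """Keep the sequences that contain at least one signature peptide."""
    return [seq for seq in unannotated if kmers(seq, k) & signatures]


# 20^2 + 20^3 + 20^4 + 20^5 matches the library size quoted in the abstract.
total = library_size()

# Toy data standing in for the disordered, structured and unannotated sets.
toy_disordered = ["MKPEPS", "GSPEPA"]
toy_structured = ["MKPEAA", "GSPAAA"]
toy_signatures = exclusive_signatures(toy_disordered, toy_structured, k=4)
toy_hits = flag_sequences(["AAKPEPAA", "AAAAAA"], toy_signatures, k=4)
```

Collecting observed peptides into sets, rather than testing each of the 3368400 library members against every sequence, yields the same exclusive set, because a peptide that never occurs in a disordered sequence cannot be a signature.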