Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York, 14853.
Proteins. 2013 Oct;81(10):1681-5. doi: 10.1002/prot.24328. Epub 2013 Aug 13.
Delineation of the relationship between sequence and structure in proteins has proven elusive. Most studies of this problem use alignment methods and other approaches based on the characteristics of individual residues. It is demonstrated herein that the sequence-structure relationship is determined in significant part by global characteristics of sequence organization. Information encoded in complete sequences is required to distinguish proteins in different architectural groups. The statistically significant differences between sequences encoding different architectures are found to reside in a surprisingly small set of low-wave-number sequence periodicities. It would therefore appear that unexpected simplicity in an appropriately defined Fourier space may be an inherent characteristic of the sequences of folded proteins.
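To make the notion of low-wave-number sequence periodicities concrete, the following is a minimal sketch of how the power at the first few wave numbers of a whole sequence might be computed. It is not the authors' method. The abstract does not state which numerical encoding of residues or which Fourier normalization was used, so the Kyte-Doolittle hydropathy scale, the function name low_wavenumber_power, and the parameter k_max are all assumptions made here for illustration.

```python
import cmath

# Kyte-Doolittle hydropathy values, used here as an ASSUMED residue encoding;
# the abstract does not say which property the authors transformed.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def low_wavenumber_power(seq, k_max=5):
    """Return the power |X_k|^2 / N for wave numbers k = 1..k_max.

    The whole sequence is treated as one signal of length N, so wave
    number k corresponds to k full oscillations along the chain.
    Residue letters outside the 20 standard ones raise KeyError.
    """
    values = [KD[residue] for residue in seq.upper()]
    n = len(values)
    mean = sum(values) / n
    # Subtract the mean so that overall composition does not leak into
    # the periodic terms.
    centered = [v - mean for v in values]
    powers = []
    for k in range(1, k_max + 1):
        # Plain discrete Fourier coefficient at wave number k.
        coeff = sum(
            c * cmath.exp(-2j * cmath.pi * k * j / n)
            for j, c in enumerate(centered)
        )
        powers.append(abs(coeff) ** 2 / n)
    return powers
```

Calling low_wavenumber_power on a full-length sequence gives a short vector of global features, one per wave number, which could then be compared between groups of sequences. Whether such a comparison reproduces the reported result depends on the encoding and the statistics, neither of which is specified in the abstract.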