Department of Biology, McGill University, Montreal, Quebec, Canada H3A 1B1.
Protein Sci. 2011 Mar;20(3):567-79. doi: 10.1002/pro.590.
Identification of ambiguous encoding in protein secondary structure is essential to understanding the key protein segments underlying amyloid diseases. We investigate two types of structurally ambivalent peptides that have been hypothesized in the literature to be indicators of amyloidogenic proteins: discordant α-helices and chameleon sequences. Chameleon sequences are peptides observed experimentally in more than one secondary-structure type. Discordant α-helices are α-helical stretches with strong β-strand propensity or prediction. To assess the distribution of these features in known protein structures, and their potential role in amyloidogenesis, we analyzed the occurrence of discordant α-helices and chameleon sequences in nonredundant sets of protein domains (n = 4263) and amyloidogenic proteins extracted from the literature (n = 77). Discordant α-helices were identified when discordance was observed between known secondary structures and secondary-structure predictions from the GOR-IV and PSIPRED algorithms. Chameleon sequences were extracted by searching for identical sequence words in α-helices and β-strands. We defined frustrated chameleons and very frustrated chameleons based on varying degrees of total β propensity ≥ α propensity. To our knowledge, this is the first study to discern statistical relationships between discordance, chameleons, and amyloidogenicity. We observed varying enrichment levels for some categories of discordant and chameleon sequences in amyloidogenic sequences. Chameleon sequences are also significantly enriched in proteins that have discordant helices, indicating a clear link between the two phenomena. We identified the first set of discordant-chameleonic protein segments that we predict may be involved in amyloidosis. We present a detailed analysis of discordant and chameleon segments in the family of one of the amyloidogenic proteins, the prion protein.
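The chameleon search described above (identical sequence words found in both α-helices and β-strands) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `chameleon_words`, the word length `k`, and the three-state labels `H` (helix), `E` (strand), and `C` (other) are assumptions made for illustration, and the abstract does not state the word length actually used.

```python
from collections import defaultdict


def chameleon_words(records, k=6):
    """Return sequence words of length k seen entirely in helix AND entirely in strand.

    records: iterable of (sequence, secondary_structure) string pairs of equal
    length. Labels 'H' (helix) and 'E' (strand) are an assumed three-state
    convention; any other character is treated as neither.
    k: word length (hypothetical default; not taken from the abstract).
    """
    states_by_word = defaultdict(set)
    for seq, ss in records:
        if len(seq) != len(ss):
            raise ValueError("sequence and structure strings must have equal length")
        for i in range(len(seq) - k + 1):
            window_states = set(ss[i:i + k])
            # Keep only windows that lie wholly inside one helix or one strand.
            if window_states == {"H"} or window_states == {"E"}:
                states_by_word[seq[i:i + k]].add(ss[i])
    # A chameleon word has been observed in both conformations.
    return sorted(w for w, s in states_by_word.items() if s == {"H", "E"})


# Toy input (invented, not from the study's data sets).
toy = [
    ("AAVLKVAGG", "CHHHHHHCC"),
    ("GGVLKVAGT", "CEEEEEECC"),
]
print(chameleon_words(toy, k=5))
```

In this toy input the only five-residue word that occurs fully inside a helix in one record and fully inside a strand in the other is `VLKVA`. A real analysis would read sequences and observed secondary structures from a nonredundant structure set rather than from hand-written strings.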