Perlman D, Halvorson H O
J Mol Biol. 1983 Jun 25;167(2):391-409. doi: 10.1016/s0022-2836(83)80341-6.
Presecretory signal peptides of 39 proteins from diverse prokaryotic and eukaryotic sources have been compared. Although varying in length and amino acid composition, the labile peptides share a hydrophobic core of approximately 12 amino acids. A positively charged residue (Lys or Arg) usually precedes the hydrophobic core. Core termination is defined by the occurrence of a charged residue, a sequence of residues which may induce a beta-turn in a polypeptide, or an interruption in potential alpha-helix or beta-extended strand structure. The hydrophobic cores contain, by weight average, 37% Leu, 15% Ala, 10% Val, 10% Phe and 7% Ile, plus 21% other hydrophobic amino acids, arranged in a non-random sequence. Following the hydrophobic cores (aligned by their last residue), a highly non-random and localized distribution of Ala is apparent within the initial eight positions following the core (formula; see text). Coincident with this observation, Ala-X-Ala is the most frequent sequence preceding signal peptidase cleavage. We propose the existence of a signal peptidase recognition sequence A-X-B with the preferred cleavage site located after the sixth amino acid following the core sequence. Twenty-two of the above 27 underlined Ala residues would participate as A or B in peptidase cleavage. Position A includes the larger aliphatic amino acids, Leu, Val and Ile, as well as the residues already found at B (principally Ala, Gly and Ser). Since a preferred cleavage site can be discerned from carboxyl-terminal and not amino-terminal alignment of the hydrophobic cores, it is proposed that the carboxyl ends are oriented inward toward the lumen of the endoplasmic reticulum, where cleavage is thought to occur. This orientation, coupled with the predicted beta-turn typically found between the core and the cleavage site, implies reverse hairpin insertion of the signal sequence. The structural features which we describe should help identify signal peptides and cleavage sites in presumptive amino acid sequences derived from DNA sequences.
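The rules summarized above (a hydrophobic core, then an A-X-B motif with cleavage preferred after the sixth residue following the core) can be sketched as a small scanner. This is only an illustrative sketch, not the authors' procedure. The function names, the strict "uninterrupted run" definition of the core, the minimum core length, the search window, and the membership of the "other hydrophobic" set (Met, Trp, Cys here) are all assumptions. The example sequence is synthetic, not a real protein.

```python
# Residue classes. Leu, Ala, Val, Phe, Ile are named in the abstract;
# Met, Trp and Cys are an ASSUMED stand-in for "other hydrophobic amino acids".
HYDROPHOBIC = frozenset("LAVFIMWC")
SITE_A = frozenset("AGSLVI")  # position A: Leu, Val, Ile plus the B residues
SITE_B = frozenset("AGS")     # position B: principally Ala, Gly, Ser


def find_core(seq, min_len=8, window=40):
    """Return (start, end) of the longest uninterrupted hydrophobic run
    in the first `window` residues (end exclusive), or None if the run is
    shorter than `min_len`. Both thresholds are assumptions."""
    best = None
    run_start = None
    # A trailing sentinel closes a run that reaches the end of the window.
    for i, aa in enumerate(seq[:window] + "-"):
        if aa in HYDROPHOBIC:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if best is None or i - run_start > best[1] - best[0]:
                best = (run_start, i)
            run_start = None
    if best is None or best[1] - best[0] < min_len:
        return None
    return best


def predict_cleavage(seq, preferred_offset=6, max_offset=8):
    """Return the index of the first residue after the predicted cleavage
    site, or None. Offsets count residues after the core, starting at 1;
    B is tested at offsets 3..max_offset, with A two residues before it."""
    core = find_core(seq)
    if core is None:
        return None
    _, end = core
    candidates = []
    for offset in range(3, max_offset + 1):
        b = end + offset - 1  # index of candidate B residue
        a = b - 2             # index of candidate A residue
        if b < len(seq) and seq[a] in SITE_A and seq[b] in SITE_B:
            candidates.append(offset)
    if not candidates:
        return None
    # Prefer the motif whose B residue lies closest to the sixth position.
    best = min(candidates, key=lambda o: (abs(o - preferred_offset), o))
    return end + best


# Synthetic example: "MKK" + 12-residue core + "SPSAQA" + mature part.
EXAMPLE = "MKKLLALLVLALLAFSPSAQADTPEKRW"
site = predict_cleavage(EXAMPLE)
print(EXAMPLE[:site], "|", EXAMPLE[site:])
```

In this synthetic sequence two A-X-B matches follow the core (S-P-S ending at position 3 and A-Q-A ending at position 6), and the sketch picks the one at position 6. Real signal peptides often have cores interrupted by Ser, Thr or Gly, so a practical scanner would need a more tolerant core definition than the strict run used here.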