Herlihy W C, Royal N J, Biemann K, Putney S D, Schimmel P R
Proc Natl Acad Sci U S A. 1980 Nov;77(11):6531-5. doi: 10.1073/pnas.77.11.6531.
A strategy has been developed for the rapid and accurate determination of the amino acid sequences of large proteins, such as many of the aminoacyl-tRNA synthetases. The strategy combines DNA sequencing of the gene for the protein of interest with gas chromatographic-mass spectrometric identification of tetra- and pentapeptides in partial hydrolysates of the entire protein or of very large fragments thereof. These peptides are matched to blocks of codons at locations scattered throughout the entire structural gene. Tetra- and pentapeptide sequences are sufficiently long that they are unlikely to be repeated in the protein sequence or to occur in an incorrect reading frame; therefore, they can be placed at unique clusters of codons on the DNA. This procedure rigorously establishes the proper phasing of the DNA throughout the entire length of the structural gene, and the protein sequence is thereby accurately read from the DNA sequence. This approach is being used to determine the amino acid sequence of Escherichia coli alanine tRNA synthetase, a protein of approximately 900 amino acids. This paper reports the sequence of the first 165 amino acids from the NH2 terminus.
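The peptide-to-codon matching described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes the standard genetic code, searches only the three forward reading frames, and requires exact matches. The function names (`translate`, `locate`, `consistent_frame`) and the toy DNA string are hypothetical and chosen purely for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# organism-specific deviations are considered in this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame):
    """Translate one forward reading frame (0, 1 or 2); unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def locate(dna, peptide):
    """Return every (frame, nucleotide_offset) at which the peptide is encoded."""
    hits = []
    for frame in range(3):
        protein = translate(dna, frame)
        start = protein.find(peptide)
        while start != -1:
            hits.append((frame, frame + 3 * start))
            start = protein.find(peptide, start + 1)
    return hits


def consistent_frame(dna, peptides):
    """Return the single frame shared by all peptides, or None.

    None means some peptide is missing or ambiguous, or the peptides fall in
    different frames (as an insertion or deletion in the DNA sequence would cause).
    """
    frames = set()
    for peptide in peptides:
        hits = locate(dna, peptide)
        if len(hits) != 1:
            return None
        frames.add(hits[0][0])
    return frames.pop() if len(frames) == 1 else None


if __name__ == "__main__":
    # Toy gene fragment encoding M-A-K-L-E-D-G-S (hypothetical, not from the paper).
    toy_dna = "ATGGCTAAACTGGAAGATGGTTCT"
    print(locate(toy_dna, "AKLE"))
    print(consistent_frame(toy_dna, ["AKLE", "EDGS"]))
```

In this sketch, peptides that all map uniquely to the same frame confirm the phasing of the DNA between them, whereas a `None` result flags a region where the DNA sequence or the peptide assignment should be rechecked.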