Horn D M, Zubarev R A, McLafferty F W
Department of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA.
Proc Natl Acad Sci U S A. 2000 Sep 12;97(19):10313-7. doi: 10.1073/pnas.97.19.10313.
A de novo sequencing program for proteins is described that uses tandem MS data from electron capture dissociation and collisionally activated dissociation of electrosprayed protein ions. Computer automation is used to convert the fragment ion mass values derived from these spectra into the most probable protein sequence, without distinguishing Leu/Ile. Minimum human input is necessary for the data reduction and interpretation. No extra chemistry is necessary to distinguish N- and C-terminal fragments in the mass spectra, as this is determined from the electron capture dissociation data. With parts-per-million mass accuracy (now available by using higher field Fourier transform MS instruments), the complete sequences of ubiquitin (8.6 kDa) and melittin (2.8 kDa) were predicted correctly by the program. The data available also provided 91% of the cytochrome c (12.4 kDa) sequence (essentially complete except for the tandem MS-resistant region K(13)-V(20) that contains the cyclic heme). Uncorrected mass values from a 6-T instrument still gave 86% of the sequence for ubiquitin, except for distinguishing Gln/Lys. Extensive sequencing of larger proteins should be possible by applying the algorithm to pieces of approximately 10-kDa size, such as products of limited proteolysis.
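The core idea of reading a sequence from fragment ion masses can be illustrated with a minimal sketch. This is not the authors' program. It assumes a hypothetical, already deconvoluted ladder of neutral masses from a single fragment series, and a simple worst-case tolerance model (both of these are assumptions, as are the names `RESIDUE_MASS`, `match_residue`, and `read_ladder`). It shows why Leu/Ile cannot be separated by mass, and why Gln/Lys, which differ by about 0.036 Da, need parts-per-million accuracy.

```python
# Monoisotopic amino acid residue masses in Da.
# Leu and Ile are isobaric, so they share one entry.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L/I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def match_residue(delta, upper_mass, ppm):
    """Return all residue labels whose mass matches `delta` within tolerance.

    Assumed error model: each of the two fragment masses may be off by
    ppm * mass, so their difference may be off by roughly twice that.
    """
    tol = 2.0 * upper_mass * ppm * 1e-6
    return [aa for aa, m in RESIDUE_MASS.items() if abs(delta - m) <= tol]


def read_ladder(masses, ppm=10.0):
    """Convert consecutive mass differences of one fragment series into residues.

    Ambiguous steps are joined with '|'; unexplained steps are reported as '?'.
    """
    masses = sorted(masses)
    calls = []
    for lo, hi in zip(masses, masses[1:]):
        hits = match_residue(hi - lo, hi, ppm)
        calls.append("|".join(hits) if hits else "?")
    return calls


if __name__ == "__main__":
    # Hypothetical ladder: an arbitrary 1000 Da fragment extended by G, K, Q, L/I.
    ladder = [1000.0]
    for aa in ["G", "K", "Q", "L/I"]:
        ladder.append(ladder[-1] + RESIDUE_MASS[aa])
    print(read_ladder(ladder, ppm=10.0))
    print(read_ladder(ladder, ppm=50.0))
```

Under these assumptions, the 10 ppm run assigns the Lys and Gln steps unambiguously, while the 50 ppm run reports both steps as `Q|K`. This mirrors the abstract's remark that uncorrected 6-T data could not distinguish Gln/Lys. A real implementation would also have to handle gaps spanning several residues, noise peaks, and the assignment of fragments to the N- or C-terminal series.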