Medical Information Center, Seoul National University Hospital, Seoul 110-744, Korea.
J Proteome Res. 2012 Sep 7;11(9):4488-98. doi: 10.1021/pr300232y. Epub 2012 Aug 13.
Selenoproteins, which contain selenocysteine (Sec, U) as the 21st amino acid in the genetic code, are well conserved from bacteria to humans, except in yeast and higher plants, which lack the Sec insertion machinery. Determining Sec associations is important for finding substrates and for understanding the redox action of selenoproteins. Although mass spectrometry (MS) has become a common and powerful tool for determining the amino acid sequence of a protein, identifying Sec-containing protein sequences by MS has been difficult because of the limited stability of Sec in selenoproteins. Se has six naturally occurring isotopes, ⁷⁴Se, ⁷⁶Se, ⁷⁷Se, ⁷⁸Se, ⁸⁰Se, and ⁸²Se, of which ⁸⁰Se is the most abundant. This isotopic pattern is a good indicator of selenopeptides, but it also makes them difficult to detect with software analysis tools developed for common peptides. Previous reports therefore verified MS scans of selenopeptides by manual inspection. No fully automated algorithm has taken the isotopes of Se into account, which leads to misinterpretation of selenopeptide spectra. In this paper, we present an algorithm to determine the monoisotopic masses of selenocysteine-containing polypeptides. The algorithm is based on a theoretical model of the isotopic distribution of a selenopeptide, in which peak intensities are derived from the natural isotopic abundances of C, H, N, O, S, and Se. It uses two kinds of isotopic peak intensity ratios: one for two adjacent peaks and another for two distant peaks. We demonstrate the accuracy of the algorithm on two LC-MS/MS data sets. Using this algorithm, we successfully identified the Sec-Cys and Sec-Sec cross-linking of glutaredoxin 1 (GRX1) from mass spectra obtained on a UPLC-ESI-q-TOF instrument.
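The abstract does not give the exact model or the definitions of the two peak ratios, so the following Python sketch is only an illustration of the general idea, not the published algorithm. It assumes nominal (integer) masses, rounded natural abundances, and a plain convolution of per-element isotope patterns; the function names (`convolve`, `element_distribution`, `isotopic_distribution`, `peak_ratio`) are hypothetical, and free selenocysteine (C3H7NO2Se) stands in for a selenopeptide.

```python
from collections import defaultdict

# Rounded natural isotopic abundances, keyed by nominal mass number.
# These are approximate reference values, not values taken from the paper.
ISOTOPES = {
    "C": {12: 0.9893, 13: 0.0107},
    "H": {1: 0.999885, 2: 0.000115},
    "N": {14: 0.99636, 15: 0.00364},
    "O": {16: 0.99757, 17: 0.00038, 18: 0.00205},
    "S": {32: 0.9499, 33: 0.0075, 34: 0.0425, 36: 0.0001},
    "Se": {74: 0.0089, 76: 0.0937, 77: 0.0763,
           78: 0.2377, 80: 0.4961, 82: 0.0873},
}


def convolve(a, b):
    """Combine two independent mass distributions (nominal mass -> probability)."""
    out = defaultdict(float)
    for mass_a, p_a in a.items():
        for mass_b, p_b in b.items():
            out[mass_a + mass_b] += p_a * p_b
    return dict(out)


def element_distribution(symbol, count):
    """Distribution for `count` atoms of one element, by repeated squaring."""
    result = {0: 1.0}
    base = ISOTOPES[symbol]
    while count:
        if count & 1:
            result = convolve(result, base)
        count >>= 1
        if count:
            base = convolve(base, base)
    return result


def isotopic_distribution(formula):
    """Theoretical isotopic distribution for a formula such as {"C": 3, "Se": 1}."""
    dist = {0: 1.0}
    for symbol, count in formula.items():
        dist = convolve(dist, element_distribution(symbol, count))
    return dist


def peak_ratio(dist, mass_a, mass_b):
    """Intensity of the peak at mass_a relative to the peak at mass_b."""
    return dist.get(mass_a, 0.0) / dist[mass_b]


# Free selenocysteine, C3H7NO2Se, used here as a stand-in for a selenopeptide.
sec = {"C": 3, "H": 7, "N": 1, "O": 2, "Se": 1}
dist = isotopic_distribution(sec)

base_peak = max(dist, key=dist.get)                    # most intense nominal mass
adjacent = peak_ratio(dist, base_peak + 1, base_peak)  # two adjacent peaks
distant = peak_ratio(dist, base_peak - 2, base_peak)   # two distant peaks

for mass in sorted(dist):
    if dist[mass] > 1e-3:
        print(mass, round(dist[mass], 4))
print("base peak:", base_peak)
print("adjacent ratio:", round(adjacent, 4), "distant ratio:", round(distant, 4))
```

For this small molecule the most intense peak is the one carrying ⁸⁰Se, six nominal mass units above the lightest (⁷⁴Se) peak, which is why tools that assume the lightest peak dominates can assign the wrong mass. In the sketch, the adjacent ratio reflects mainly the C, H, N, and O isotopes, while the distant ratio is dominated by the ⁷⁸Se/⁸⁰Se abundances and so acts as a selenium signature.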