Anal Chem. 2019 Jul 2;91(13):8492-8499. doi: 10.1021/acs.analchem.9b01625. Epub 2019 Jun 14.
Covalent labeling with mass spectrometry (CL-MS) provides a direct measure of the chemical and structural features of proteins, with the potential for resolution at the amino-acid level. Unfortunately, most applications of CL-MS are limited to narrowly defined differential analyses, in which small numbers of residues are compared between two or more protein states. Extending the utility of high-resolution CL-MS to structure-based applications requires more robust computational routines and methodology capable of accurately reporting labeling yield. Here, we provide a substantial improvement in the analysis of CL-MS data through an extended plug-in built within the Mass Spec Studio development framework (MSS-CLEAN). All elements of data analysis, from database search to site-resolved and normalized labeling output, are accommodated, as illustrated through the nonselective labeling of the human kinesin Eg5 with photoconverted 3,3'-azibutan-1-ol. In developing the new features within the CL-MS plug-in, we identified additional complexities associated with the application of CL reagents, arising primarily from digestion-induced bias in yield measurements and ambiguities in site localization. We present a strategy that uses redundant site labeling data from overlapping peptides, imputation of missing data, and a normalization routine to determine relative protection factors. Together, these elements provide for a robust structural interpretation of CL-MS/MS data while minimizing the over-reporting of labeling site resolution. Finally, to minimize bias, we recommend that digestion strategies for generating useful overlapping peptides apply complementary enzymes that drive digestion to completion.
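The three-part strategy named above (pooling redundant site data from overlapping peptides, imputing missing values, and normalizing to relative protection factors) can be illustrated with a minimal Python sketch. It is not the MSS-CLEAN implementation, which the abstract does not describe. The function names, the nearest-neighbour imputation rule, and the definition of relative protection as the median yield divided by the site yield are all assumptions made for illustration.

```python
from statistics import mean, median


def combine_overlapping(peptide_yields):
    """Pool per-residue labeling yields reported by overlapping peptides.

    peptide_yields: list of dicts (one per peptide) mapping
    residue index -> labeling yield. Returns residue -> mean yield.
    Averaging is an assumed way of combining redundant measurements.
    """
    pooled = {}
    for yields in peptide_yields:
        for residue, value in yields.items():
            pooled.setdefault(residue, []).append(value)
    return {residue: mean(values) for residue, values in pooled.items()}


def impute_missing(site_yields, residues):
    """Fill unmeasured residues from the nearest measured neighbours.

    Hypothetical rule: use the mean of the closest measured residue on
    each side (or the single available side at the sequence edges).
    """
    measured = sorted(site_yields)
    filled = dict(site_yields)
    for residue in residues:
        if residue in filled:
            continue
        left = [m for m in measured if m < residue]
        right = [m for m in measured if m > residue]
        neighbours = []
        if left:
            neighbours.append(site_yields[left[-1]])
        if right:
            neighbours.append(site_yields[right[0]])
        if neighbours:
            filled[residue] = mean(neighbours)
    return filled


def relative_protection(site_yields, floor=1e-6):
    """Normalize yields to a relative protection factor.

    Assumed definition: median yield across sites divided by the site
    yield, so values above 1 indicate more protection than typical.
    `floor` guards against division by zero for unlabeled sites.
    """
    reference = median(site_yields.values())
    return {r: reference / max(y, floor) for r, y in site_yields.items()}


if __name__ == "__main__":
    # Two hypothetical overlapping peptides that both report residue 11.
    peptides = [{10: 0.2, 11: 0.4}, {11: 0.6, 13: 0.1}]
    combined = combine_overlapping(peptides)
    filled = impute_missing(combined, range(10, 14))
    for residue, factor in sorted(relative_protection(filled).items()):
        print(residue, round(factor, 3))
```

Keeping the three steps as separate functions means each assumption (averaging, neighbour imputation, median normalization) can be swapped out on its own for whatever rule a given analysis actually uses.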