Zhao Xiaoxiao, Hu Hao, Zhao Wensi, Liu Ping, Tan Minjia
School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China.
State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China.
Se Pu. 2022 Jan;40(1):17-27. doi: 10.3724/SP.J.1123.2021.03030.
As unique biomarkers, protein C-termini are involved in various biological processes, such as protein trafficking, subcellular relocation, and signal transduction. Dysregulation of protein C-terminal status plays a critical role in the development of various diseases, including cardiovascular, neurodegenerative, and metabolic diseases and cancer. Global profiling of protein C-termini is therefore of great value for gaining mechanistic insight into biological and pathological processes and for identifying potential new therapeutic targets. Polymer-based negative enrichment is a prominent C-terminomics strategy that offers universal applicability and parallel sample preparation. Within this strategy, however, approaches based on enzymatic cleavage at Arg residues still achieve a shallower profiling depth than the other methods, which greatly limits our understanding of the physiological functions and molecular mechanisms of C-termini. To provide a more powerful tool for C-terminomics, Arg cleavage-based negative enrichment C-terminomics was optimized and evaluated. First, the sample preparation process was optimized. A one-pot enrichment platform based on a V-shaped filter was established, which reduced sample loss, avoided cross-contamination between reactions, and shortened sample preparation time. In addition, the protein-level acetylation conditions were investigated. The optimal labeling conditions were triple coupling using 5 mmol/L Ac-NHS at pH 7.0 and 500 mmol/L ammonium for 15 min. These conditions minimized acetylation of the hydroxyl-containing residues (acetylation labeling efficiencies of Ser, Thr, and Tyr were lower than 4%, 2%, and 1%, respectively) while giving the highest number of peptide-spectrum matches and a satisfactory Lys labeling efficiency (up to 98%). The optimized conditions thus not only minimize acetylation of these residues but also facilitate the identification of C-terminal peptides.
Second, it was speculated that the unexpectedly low identification rate was caused primarily by interference from the large number of organic compounds that accumulated during the peptide-level reactions, including reagents, organic buffering agents, and their complex side-reaction products. Therefore, the conditions for StageTip-based fractionation, including pH, the amount of Empore C18 beads, and the number of fractions, were optimized. As a result, separating the sample enriched from 300 μg of proteome into seven fractions greatly reduced sample complexity, and a total of 696 C-termini were identified in duplicate under strict data filtration, that is, Percolator false discovery rate (FDR) <0.01, ion score ≥20, and C-terminal amidation by ethanolamine. When only peptide FDR <0.01 was required, the number of identified C-termini increased further to 933, which is among the largest C-terminome datasets obtained with the polymer-based strategy. Comparison with the results of a previous study indicated that the optimized method is a practical strategy for broader C-terminome coverage. Finally, to further broaden the coverage of the sub-C-terminome generated by Arg-specific cleavage, this study explored a new method in which ArgN-specific cleavage (cleavage at the N-terminal side of Arg by LysargiNase) was combined with different N-terminal protections (dimethylation and acetylation). Among all the combinations, adding the "LysargiNase+N-terminal acetylation" method to "trypsin+N-terminal dimethylation" increased the number of unique C-termini identified by 47%, and the two methods together covered 87% of the total C-termini. Therefore, parallel use of the two methods would further expand the coverage of Arg-cleaved C-terminal peptides. Analysis of the physicochemical properties of the peptides identified by the two methods explained why the C-terminal peptides identified by the different strategies are complementary.
In conclusion, the optimized C-terminomics platform enables deep profiling of Arg cleavage-generated C-terminal peptides using a polymer-based approach and provides a powerful tool for the global characterization of protein C-termini.
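The strict data filtration named in the abstract (Percolator FDR <0.01, ion score ≥20, and C-terminal amidation by ethanolamine) and the complementarity figures reported for two methods can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `PSM` record, its field names, and the functions are hypothetical and are not the authors' pipeline or the output format of any search engine. In particular, reading the reported gain as "C-termini found only by the added method, relative to the base method" and the coverage as "union of the two methods over all C-termini found by every combination" is one plausible interpretation, not a definition given in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """Hypothetical peptide-spectrum match record (field names are assumptions)."""
    peptide: str
    q_value: float             # FDR estimate, e.g. a Percolator-style q-value
    ion_score: float           # search-engine ion score
    cterm_ethanolamine: bool   # True if the C-terminus carries ethanolamine amidation


def passes_strict_filter(psm, max_fdr=0.01, min_ion_score=20.0):
    """Apply the three criteria listed in the abstract to one PSM."""
    return (
        psm.q_value < max_fdr
        and psm.ion_score >= min_ion_score
        and psm.cterm_ethanolamine
    )


def strict_cterminal_peptides(psms):
    """Return the set of unique peptides that survive the strict filter."""
    return {p.peptide for p in psms if passes_strict_filter(p)}


def complementarity(base, added, all_methods):
    """Return (relative gain, coverage) for two sets of identified C-termini.

    gain     = C-termini found only by `added`, relative to the size of `base`
    coverage = fraction of `all_methods` found by `base` or `added`
    """
    gain = len(added - base) / len(base)
    coverage = len(base | added) / len(all_methods)
    return gain, coverage
```

Under this reading, a gain of 0.47 and a coverage of 0.87 would correspond to the 47% and 87% figures quoted above. Relaxing the filter to peptide FDR alone would amount to dropping the ion-score and amidation conditions from `passes_strict_filter`.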