National Institute of Biological Sciences, Beijing, China.
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.
Methods Mol Biol. 2022;2500:105-129. doi: 10.1007/978-1-0716-2325-1_9.
The remarkable advancement of top-down proteomics in the past decade has been driven by technological developments in separation, mass spectrometry (MS) instrumentation, novel fragmentation methods, and bioinformatics. However, the accurate identification and quantification of proteoforms, that is, all clearly defined molecular forms of the protein products of a single gene, remains a challenging computational task. This is in part because mass spectra of intact proteoforms are more complicated than those of digested peptides. Herein, pTop 2.0 is developed to bridge the gap between large-scale, complex top-down MS data and the shortage of high-accuracy bioinformatic tools. Compared with the first version, pTop 1.0, pTop 2.0 concentrates mainly on the identification of proteoforms with unexpected modifications or a terminal truncation. Quantitation based on isotopic labeling is a new function; it is integrated with the qualitative identification and can be carried out through a user-friendly "one-key operation." The accuracy and running speed of pTop 2.0 are significantly improved on the test data sets. This chapter introduces the main features, step-by-step running operations, and algorithmic developments of pTop 2.0, with the aim of raising the accuracy of identification and quantitation of intact proteoforms in top-down proteomics.