Proteoform identification and quantification based on alignment graphs.

Authors

Zhan Zhaohui, Wang Lusheng

Affiliations

Department of Engineering, Shenzhen MSU-BIT University, Shenzhen, 518172, China.

Department of Computer Science, City University of Hong Kong, Hong Kong, 999077, China.

Publication information

Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btaf007.

Abstract

MOTIVATION

Proteoforms are the different forms of a protein generated from the genome through sequence variations, splice isoforms, and post-translational modifications. Proteoforms regulate protein structures and functions. A single protein can have multiple proteoforms due to different modification sites. Proteoform identification finds the proteoforms of a given protein that best fit the input spectrum. Proteoform quantification finds the corresponding abundances of the different proteoforms of a specific protein.

RESULTS

We propose algorithms for proteoform identification and quantification based on top-down tandem mass spectra. In the combination alignment of the HomMTM spectrum and the reference protein, the mass of each matched peak must be corrected within the pre-defined error range. After the correction, we require that the mass between any two (not necessarily consecutive) matched nodes in the protein be identical to that between the corresponding two matched peaks in the HomMTM spectrum. We design a back-tracking graph to store this information and find a combinatorial path (k paths) with the minimum sum of peak intensity errors in this graph. The obtained alignment also shows the relative abundances of these proteoforms (paths). Our experimental results demonstrate that the algorithm can identify and quantify proteoform combinations that cover a greater number of peaks. This advance holds promise for improving the accuracy and comprehensiveness of proteoform quantification, addressing a crucial need in top-down MS-based proteomics.
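
The mass-correction constraint can be read as a small feasibility problem. The following is a minimal sketch of that reading only, not the TopMGQuant implementation: it assumes each matched protein node has a known theoretical mass and that every peak may be shifted by at most a tolerance `tol`. The names `consistent_corrections`, `node_masses`, `peak_masses`, and `tol` are hypothetical, and the sketch ignores modifications, the back-tracking graph, and the search for k paths. Under these assumptions, requiring equal mass differences for every pair of matches is equivalent to requiring that all corrected peaks equal the node mass plus one shared offset, so a consistent correction exists exactly when the per-peak tolerance intervals for that offset overlap.

```python
from typing import List, Optional, Sequence


def consistent_corrections(
    node_masses: Sequence[float],
    peak_masses: Sequence[float],
    tol: float,
) -> Optional[List[float]]:
    """Return corrected peak masses, or None if no consistent correction exists.

    node_masses[i] is the assumed theoretical mass of the i-th matched protein
    node, and peak_masses[i] is the observed mass of the peak matched to it.
    """
    if len(node_masses) != len(peak_masses):
        raise ValueError("each matched node needs exactly one matched peak")
    if not node_masses:
        return []
    # Equal pairwise differences force corrected[i] = node_masses[i] + d for a
    # single shared offset d. Each match restricts d to an interval of width
    # 2 * tol centred on (peak - node).
    lo = max(p - n - tol for n, p in zip(node_masses, peak_masses))
    hi = min(p - n + tol for n, p in zip(node_masses, peak_masses))
    if lo > hi:
        return None  # the intervals share no common offset
    d = (lo + hi) / 2.0  # any value in [lo, hi] works; take the midpoint
    return [n + d for n in node_masses]


if __name__ == "__main__":
    nodes = [100.0, 250.0, 400.0]
    peaks = [100.01, 249.99, 400.02]
    print(consistent_corrections(nodes, peaks, tol=0.02))  # a feasible correction
    print(consistent_corrections(nodes, peaks, tol=0.01))  # too tight: None
```

Taking the midpoint of the feasible interval is an arbitrary choice made for this illustration; the abstract does not say how the correction is selected among feasible values.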

AVAILABILITY AND IMPLEMENTATION

The software package is available at https://github.com/Zeirdo/TopMGQuant.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9c50/11769674/9529bf2d5dd3/btaf007f1.jpg
