School of Informatics and Computing, Indiana University-Purdue University Indianapolis, Indianapolis, Indiana 46202, United States.
Department of Biomedical Informatics and Department of Computer Science and Engineering, Ohio State University, Columbus, Ohio 43210, United States.
Anal Chem. 2020 Jun 2;92(11):7778-7785. doi: 10.1021/acs.analchem.0c00903. Epub 2020 May 13.
Top-down mass spectrometry has become the main method for intact proteoform identification, characterization, and quantitation. Because top-down mass spectrometry data are complex, spectral deconvolution is an indispensable step in spectral data analysis: it groups spectral peaks into isotopic envelopes and extracts the monoisotopic masses of precursor or fragment ions. The performance of spectral deconvolution methods relies heavily on their scoring functions, which distinguish correct envelopes from incorrect ones. A good scoring function increases the accuracy of the deconvoluted masses reported from mass spectra. In this paper, we present EnvCNN, a convolutional neural network-based model for evaluating isotopic envelopes. We show that the model outperforms other scoring functions in distinguishing correct envelopes from incorrect ones, and that it increases the number of identifications and improves their statistical significance in top-down spectral interpretation.
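To make the role of an envelope scoring function concrete, the following is a minimal sketch of a simple baseline, not the EnvCNN model itself: it matches observed peaks to a theoretical isotopic envelope and scores the match by cosine similarity of intensities. All function names, the 10 ppm tolerance, and the use of the 13C-12C mass difference as the isotope spacing are illustrative assumptions, not details taken from the paper.

```python
import math

PROTON_MASS = 1.007276      # Da, mass of a proton
ISOTOPE_SPACING = 1.003355  # Da, approx. 13C - 12C difference (assumed spacing)


def neutral_mono_mass(mono_mz, charge):
    """Neutral monoisotopic mass from the monoisotopic m/z and charge state."""
    return (mono_mz - PROTON_MASS) * charge


def theoretical_mzs(mono_mz, charge, n_peaks):
    """m/z positions of the first n_peaks isotopic peaks of an envelope."""
    return [mono_mz + i * ISOTOPE_SPACING / charge for i in range(n_peaks)]


def match_intensities(theo_mzs, observed, ppm_tol=10.0):
    """For each theoretical m/z, take the most intense observed peak
    within ppm_tol; use 0.0 when no observed peak matches.
    `observed` is a list of (mz, intensity) pairs."""
    matched = []
    for t in theo_mzs:
        tol = t * ppm_tol * 1e-6
        best = 0.0
        for mz, inten in observed:
            if abs(mz - t) <= tol and inten > best:
                best = inten
        matched.append(best)
    return matched


def cosine_score(theo_int, obs_int):
    """Cosine similarity between theoretical and matched intensities.
    Returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(theo_int, obs_int))
    norm = math.sqrt(sum(a * a for a in theo_int)) * math.sqrt(
        sum(b * b for b in obs_int)
    )
    return dot / norm if norm > 0 else 0.0
```

A candidate envelope whose matched intensities are proportional to the theoretical ones scores close to 1, while one with missing or distorted peaks scores lower. A learned model such as the one described in the abstract would replace this hand-written similarity with a trained classifier.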