PB-Net: Automatic peak integration by sequential deep learning for multiple reaction monitoring.

Authors

Wu Zhenqin, Serie Daniel, Xu Gege, Zou James

Affiliations

InterVenn Biosciences, United States of America; Department of Chemistry, Stanford University, United States of America.

InterVenn Biosciences, United States of America.

Publication information

J Proteomics. 2020 Jul 15;223:103820. doi: 10.1016/j.jprot.2020.103820. Epub 2020 May 13.

Abstract

Mass spectrometry (MS)-based proteomics has become an indispensable component of modern molecular and cellular biochemistry analysis. Multiple reaction monitoring (MRM) is one of the most well-established MS techniques for molecule detection and quantification. Despite its wide usage, an accurate computational framework for analyzing MRM data is lacking, and expert annotation is often required, especially to perform peak integration. Here we propose PB-Net (Peak Boundary Neural Network), a deep learning method built upon recent advances in sequential neural networks, for fully automatic chromatographic peak integration. To train PB-Net, we generated a large dataset of over 170,000 expert-annotated peaks from MS transitions spanning a wide dynamic range, including both peptides and intact glycopeptides. Our model demonstrated outstanding performance on unseen test samples, reaching near-perfect agreement (Pearson's r = 0.997) with human-annotated ground truth. Systematic evaluations also show that PB-Net is substantially more robust and accurate than previous state-of-the-art peak integration software. PB-Net can benefit the wide community of mass spectrometry data analysis, especially in applications involving high-throughput MS experiments. Code and test data used in this work are available at https://github.com/miaecle/PB-net.

SIGNIFICANCE: Human annotations play an important role in the accurate quantification of multiple reaction monitoring (MRM) experiments, though they are costly to collect and limit analysis throughput. In this work we proposed and developed a novel technique for the peak-integration step in MRM, based on recent innovations in sequential deep learning models. We collected a total of 170,000 expert-annotated MRM peaks and trained a set of accurate and robust neural networks for the task. The results demonstrated a substantial improvement over the current state-of-the-art software for mass spectrometry analysis and a level of accuracy and precision comparable to that of human annotators.
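To make the evaluation described in the abstract concrete, the following is a minimal sketch of how peak areas might be computed from peak boundaries and compared against annotated areas with Pearson's r. It is not PB-Net's implementation: the function names `integrate_peak` and `pearson_r`, the trapezoidal integration without baseline subtraction, the synthetic Gaussian traces, and the boundary values are all assumptions made for illustration. The boundaries are simply given here, whereas in the paper they would come from the model or from an expert.

```python
import numpy as np


def integrate_peak(times, intensities, start, end):
    """Trapezoidal area of a chromatogram between two boundary times (inclusive).

    Hypothetical helper: no baseline subtraction is applied.
    """
    times = np.asarray(times, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    mask = (times >= start) & (times <= end)
    t, y = times[mask], intensities[mask]
    if t.size < 2:
        return 0.0  # fewer than two points inside the boundaries: no area
    return float(np.sum((t[1:] - t[:-1]) * (y[1:] + y[:-1]) / 2.0))


def pearson_r(x, y):
    """Pearson correlation coefficient between two sequences of peak areas."""
    return float(np.corrcoef(x, y)[0, 1])


# Synthetic traces (assumed data): Gaussian peaks of increasing height.
times = np.linspace(0.0, 10.0, 501)
heights = [1.0, 5.0, 20.0, 100.0]
traces = [h * np.exp(-0.5 * ((times - 5.0) / 0.5) ** 2) for h in heights]

# Assumed boundaries: a reference annotation and a slightly narrower prediction.
reference_bounds = (3.5, 6.5)
predicted_bounds = (3.6, 6.4)

reference_areas = [integrate_peak(times, y, *reference_bounds) for y in traces]
predicted_areas = [integrate_peak(times, y, *predicted_bounds) for y in traces]

print("Pearson's r between predicted and reference areas:",
      pearson_r(predicted_areas, reference_areas))
```

In this sketch the correlation is computed on raw areas. Because the abstract mentions a wide dynamic range, comparing log-transformed areas may be a reasonable alternative, though the abstract does not say which was used.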

