A Fast and Memory-Efficient Spectral Library Search Algorithm Using Locality-Sensitive Hashing.

Affiliation

School of Informatics and Computing, Indiana University, Bloomington, IN, 47405, USA.

Publication information

Proteomics. 2020 Nov;20(21-22):e2000002. doi: 10.1002/pmic.202000002. Epub 2020 Jun 29.

DOI: 10.1002/pmic.202000002
PMID: 32415809
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7669687/
Abstract

With the accumulation of MS/MS spectra in spectral libraries, spectral library searching has emerged as an important approach for peptide identification in proteomics. It complements the commonly used protein database searching approach, particularly in proteomic analyses of well-studied model organisms, such as human. Existing spectral library searching algorithms compare a query MS/MS spectrum with every library spectrum that has a matching precursor mass and charge state, which may become computationally intensive as library sizes grow rapidly. Here, the software msSLASH, which implements a fast spectral library searching algorithm based on the Locality-Sensitive Hashing (LSH) technique, is presented. The algorithm first converts the library and query spectra into bit-strings using LSH functions, and then computes the similarity between spectra whose bit-strings are highly similar. In spectral library searches of large real-world MS/MS datasets, the algorithm significantly reduced the number of spectral comparisons and, as a result, achieved a 2-9X speedup over the existing spectral library searching algorithm SpectraST. msSLASH is implemented in C/C++ and is ready to be used in proteomic data analyses.
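
The two-step idea in the abstract (hash spectra to bit-strings, then score only spectra with similar bit-strings) can be illustrated with a small Python sketch. This is not the msSLASH implementation, which is written in C/C++. The abstract does not say which LSH family, bit-string length, or similarity measure is used, so the random-hyperplane hashing, the 1 m/z binning, the 16-bit signatures, the Hamming-distance cutoff, and the cosine score below are all assumptions, and every function name is hypothetical. Precursor mass and charge filtering is omitted for brevity.

```python
import math
import random

N_BINS = 2000   # assumption: 1 m/z-wide bins covering 0-2000 m/z
N_BITS = 16     # assumption: length of each bit-string signature


def bin_spectrum(peaks, bin_width=1.0):
    """Convert (m/z, intensity) pairs into a sparse unit-length vector."""
    vec = {}
    for mz, intensity in peaks:
        idx = int(mz / bin_width)
        if 0 <= idx < N_BINS:
            vec[idx] = vec.get(idx, 0.0) + intensity
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {i: v / norm for i, v in vec.items()} if norm > 0 else {}


def make_hyperplanes(seed=0):
    """Draw N_BITS random Gaussian hyperplanes; each defines one hash bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(N_BINS)] for _ in range(N_BITS)]


def signature(vec, planes):
    """Bit i is 1 when the spectrum lies on the non-negative side of plane i."""
    bits = 0
    for plane in planes:
        dot = sum(v * plane[i] for i, v in vec.items())
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    return sum(v * b.get(i, 0.0) for i, v in a.items())


def build_index(library, planes):
    """Precompute (name, vector, signature) for every library spectrum."""
    index = []
    for name, peaks in library:
        vec = bin_spectrum(peaks)
        index.append((name, vec, signature(vec, planes)))
    return index


def search(query_peaks, index, planes, max_hamming=2):
    """Score only library spectra whose signature is within max_hamming bits."""
    q_vec = bin_spectrum(query_peaks)
    q_sig = signature(q_vec, planes)
    best, compared = (None, -1.0), 0
    for name, vec, sig in index:
        if hamming(q_sig, sig) <= max_hamming:
            compared += 1  # full similarity is computed only for candidates
            score = cosine(q_vec, vec)
            if score > best[1]:
                best = (name, score)
    return best, compared
```

Calling `search(peaks, build_index(library, planes), planes)` returns the best-scoring library entry together with the number of full similarity computations performed. Signature comparison is a cheap XOR and bit count, so the saving comes from skipping the more expensive spectrum-to-spectrum scoring for entries whose signatures differ in many bits. A production implementation would more likely look candidates up in hash tables keyed by signature than scan every signature as this sketch does.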
