Quality assessment of tandem mass spectra using support vector machine (SVM).

Authors

Zou An-Min, Wu Fang-Xiang, Ding Jia-Rui, Poirier Guy G

Affiliation

Department of Mechanical Engineering, University of Saskatchewan, 57 Campus Dr, Saskatoon, SK, S7N 5A9, Canada.

Publication information

BMC Bioinformatics. 2009 Jan 30;10 Suppl 1(Suppl 1):S49. doi: 10.1186/1471-2105-10-S1-S49.

DOI:10.1186/1471-2105-10-S1-S49
PMID:19208151
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2648784/
Abstract

BACKGROUND

Tandem mass spectrometry has become particularly useful for the rapid identification and characterization of the protein components of complex biological mixtures. Powerful database search methods, such as SEQUEST and MASCOT, have been developed for peptide identification. They work by comparing the mass spectra obtained from unknown proteins or peptides with theoretically predicted spectra derived from protein databases. However, most spectra generated in a mass spectrometry experiment are of too poor a quality to be interpreted, while some high-quality spectra cannot be interpreted by one method but may be interpretable by others. Hence a filtering algorithm that removes poor-quality spectra before the database search is appealing.

RESULTS

This paper proposes a support vector machine (SVM) based approach to assessing the quality of tandem mass spectra. Each mass spectrum is mapped onto 16 proposed features that describe its quality. Based on the results from SEQUEST, four SVM classifiers that take the 16 features as input are trained and tested on ISB data and TOV data, respectively. The superior performance of the proposed SVM classifiers is illustrated both by comparison with existing classifiers and by validation against MASCOT search results.
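The filtering idea can be illustrated with a minimal sketch. The abstract does not define the 16 features, the SVM kernel, or the training procedure, so everything here is an assumption for illustration only: the feature vectors are synthetic Gaussian stand-ins, the labels stand in for SEQUEST-derived quality labels, and the classifier is a plain linear soft-margin SVM trained by stochastic subgradient descent on the hinge loss, not the authors' classifiers. The names `make_spectra`, `train_linear_svm`, and `decision_value` are hypothetical.

```python
import random

N_FEATURES = 16  # the abstract states 16 features; their definitions are not given there


def make_spectra(n, rng):
    """Synthetic stand-in data (assumption): label +1 = high quality, -1 = poor quality."""
    data = []
    for _ in range(n):
        label = rng.choice([1, -1])
        # Each fake feature is drawn around +1 or -1 depending on the label.
        features = [rng.gauss(label * 1.0, 0.5) for _ in range(N_FEATURES)]
        data.append((features, label))
    return data


def decision_value(w, b, x):
    """Signed score w.x + b; positive means 'predicted high quality'."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(data, lam=0.01, eta=0.01, epochs=20, seed=0):
    """Linear soft-margin SVM via stochastic subgradient descent on the hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * N_FEATURES
    b = 0.0
    order = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            violated = y * decision_value(w, b, x) < 1  # inside the margin or misclassified
            # L2 regularisation step: shrink the weights slightly.
            w = [wj * (1 - eta * lam) for wj in w]
            if violated:
                # Hinge-loss subgradient step: move the boundary toward this sample's side.
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


rng = random.Random(42)
train, test = make_spectra(200, rng), make_spectra(100, rng)
w, b = train_linear_svm(train)

# Pre-search filter: keep only spectra the classifier scores as high quality.
kept = [x for x, _ in test if decision_value(w, b, x) > 0]
accuracy = sum((decision_value(w, b, x) > 0) == (y == 1) for x, y in test) / len(test)
print(f"kept {len(kept)} of {len(test)} spectra, accuracy {accuracy:.2f}")
```

In a real pipeline, only the spectra in `kept` would be passed on to a search engine such as SEQUEST or MASCOT. The threshold of 0 on the decision value is a design choice that could be shifted to trade discarded good spectra against retained poor ones.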

CONCLUSION

The proposed method can be used to effectively remove poor-quality spectra before spectral searching, and also to find more peptides or post-translationally modified peptides from high-quality spectra using different search engines or de novo methods.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb45/2648784/13f9694596c0/1471-2105-10-S1-S49-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb45/2648784/7fe429dcde22/1471-2105-10-S1-S49-1.jpg
