Availability of MudPIT data for classification of biological samples.

Authors

Di Silvestre Dario, Zoppis Italo, Brambilla Francesca, Bellettato Valeria, Mauri Giancarlo, Mauri Pierluigi

Affiliations

Institute for Biomedical Technologies (ITB-CNR), via F.lli Cervi 93, Segrate (Milan), Italy.

Department of Informatics, Systems and Communication, Viale Sarca 336, University of Milano-Bicocca, Milan, Italy.

Publication information

J Clin Bioinforma. 2013 Jan 14;3(1):1. doi: 10.1186/2043-9113-3-1.

DOI:10.1186/2043-9113-3-1

PMID:23317455

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3563498/

Abstract

BACKGROUND

Mass spectrometry is an important analytical tool for clinical proteomics. Primarily employed for biomarker discovery, it is increasingly used to develop methods that may help provide an unambiguous diagnosis of biological samples. In this context, we investigated the classification of phenotypes by applying a support vector machine (SVM) to experimental data obtained by the MudPIT (multidimensional protein identification technology) approach. In particular, we compared the performance of the SVM using two independent collections of complex samples and different data types, such as mass spectra (m/z), peptides and proteins.

RESULTS

Overall, protein and peptide data provided better discriminative information than experimental mass spectra (overall accuracy higher than 87% in both collections 1 and 2). These results indicate that sequencing peptides and proteins reduces the experimental noise affecting the raw mass spectra and allows the extraction of more informative features for the effective classification of samples. In addition, the protein and peptide features selected by the SVM showed an 80% match with the differentially expressed proteins identified by the MAProMa software.

CONCLUSIONS

These findings confirm the usability of most label-free quantitative methods based on the processing of spectral counts and SEQUEST-based SCORE values. They also stress the usefulness of MudPIT data for correctly grouping sample phenotypes by applying both supervised and unsupervised learning algorithms. This capacity permits the evaluation of actual samples and is a good starting point for translating proteomic methodology to clinical application.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/52dd/3563498/d226464414e4/2043-9113-3-1-1.jpg
