Improved model-based, platform-independent feature extraction for mass spectrometry.

Author information

Noy Karin, Fasulo Daniel

Affiliation

Integrated Data System Department, Siemens Corporate Research, 755 College Road East, Princeton, NJ 08540, USA.

Publication information

Bioinformatics. 2007 Oct 1;23(19):2528-35. doi: 10.1093/bioinformatics/btm385. Epub 2007 Aug 13.

DOI:10.1093/bioinformatics/btm385

PMID:17698491

Abstract

MOTIVATION

Mass spectrometry (MS) is increasingly being used for biomedical research. The typical analysis of MS data consists of several steps. Feature extraction is a crucial step since subsequent analyses are performed only on the detected features. Current methodologies applied to low-resolution MS, in which features are peaks or wavelet functions, are parameter-sensitive and inaccurate in the sense that peaks and wavelet functions do not directly correspond to the underlying molecules under observation. In high-resolution MS, the model-based approach is more appealing as it can provide a better representation of the MS signals by incorporating information about peak shapes and isotopic distributions. Current model-based techniques are computationally expensive; various algorithms have been proposed to improve the computational efficiency of this paradigm. However, these methods cannot deal well with overlapping features, especially when they are merged to create one broad peak. In addition, no method has been proven to perform well across different MS platforms.

RESULTS

We propose a new model-based approach to feature extraction in which spectra are decomposed into a mixture of distributions derived from peptide models. By incorporating kernel-based smoothing and perceptual similarity for matching distributions, our statistical framework improves on existing methodologies in both computational efficiency and accuracy. Our model is parameterized by physical properties and is therefore applicable to different MS instruments and settings. We validate our approach on simulated data, and show that its performance is higher than that of commonly used tools on real high- and low-resolution MS and MS/MS data sets.
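The general idea of decomposing a spectrum into a mixture of peptide-derived distributions can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in plain non-negative least squares (fitted by multiplicative updates) and does not reproduce the kernel-based smoothing or perceptual-similarity matching mentioned above. The Poisson-shaped isotopic envelope, the Gaussian peak shape, and every name and parameter value (`peptide_template`, `fit_amplitudes`, `lam`, `sigma`, the two example features) are assumptions chosen for illustration.

```python
import math

ISOTOPE_SPACING = 1.00335  # approximate 13C-12C mass difference in Da


def isotope_weights(lam, n_peaks=5):
    """Poisson-shaped isotopic envelope, normalized to sum to 1 (an assumption)."""
    raw = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(n_peaks)]
    total = sum(raw)
    return [r / total for r in raw]


def peptide_template(mz_axis, mono_mz, charge, lam, sigma):
    """Unit-amplitude signal of one feature: Gaussian peaks at its isotope positions."""
    weights = isotope_weights(lam)
    centers = [mono_mz + k * ISOTOPE_SPACING / charge for k in range(len(weights))]
    return [
        sum(w * math.exp(-0.5 * ((mz - c) / sigma) ** 2) for w, c in zip(weights, centers))
        for mz in mz_axis
    ]


def fit_amplitudes(spectrum, templates, n_iter=500, eps=1e-12):
    """Non-negative least-squares amplitudes via multiplicative updates."""
    n = len(templates)
    # Precompute B^T y and the Gram matrix B^T B (one template per column of B).
    bty = [sum(t * y for t, y in zip(tmpl, spectrum)) for tmpl in templates]
    gram = [[sum(a * b for a, b in zip(ti, tj)) for tj in templates] for ti in templates]
    amps = [1.0] * n
    for _ in range(n_iter):
        denom = [sum(gram[j][k] * amps[k] for k in range(n)) for j in range(n)]
        # Each update keeps amplitudes non-negative because all factors are non-negative.
        amps = [amps[j] * bty[j] / (denom[j] + eps) for j in range(n)]
    return amps


# Two overlapping hypothetical features: charge 1 at m/z 1000.0, charge 2 at m/z 1000.5.
mz_axis = [999.0 + 0.02 * i for i in range(301)]
templates = [
    peptide_template(mz_axis, 1000.0, 1, lam=0.55, sigma=0.1),
    peptide_template(mz_axis, 1000.5, 2, lam=1.1, sigma=0.1),
]
true_amps = [3.0, 1.5]
# Noise-free synthetic spectrum built as a weighted sum of the two templates.
spectrum = [true_amps[0] * a + true_amps[1] * b for a, b in zip(*templates)]
estimated = fit_amplitudes(spectrum, templates)
```

In this noise-free toy case, `estimated` should come out close to `true_amps` even though the second isotope peak of the first feature and the second peak of the doubly charged feature nearly coincide. Real spectra would additionally require noise handling and a way to propose candidate features, which the sketch leaves out.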

Similar articles

Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching.

Bioinformatics. 2006 Sep 1;22(17):2059-65. doi: 10.1093/bioinformatics/btl355. Epub 2006 Jul 4.

Guilt-by-association feature selection: identifying biomarkers from proteomic profiles.

J Biomed Inform. 2008 Feb;41(1):124-36. doi: 10.1016/j.jbi.2007.04.003. Epub 2007 Apr 14.

Independent component analysis for the extraction of reliable protein signal profiles from MALDI-TOF mass spectra.

Bioinformatics. 2008 Jan 1;24(1):63-70. doi: 10.1093/bioinformatics/btm533. Epub 2007 Nov 14.

Peak bagging for peptide mass fingerprinting.

Bioinformatics. 2008 May 15;24(10):1293-9. doi: 10.1093/bioinformatics/btn123. Epub 2008 Apr 7.

Proteomic mass spectra classification using decision tree based ensemble methods.

Bioinformatics. 2005 Jul 15;21(14):3138-45. doi: 10.1093/bioinformatics/bti494. Epub 2005 May 12.

Automated image alignment for 2D gel electrophoresis in a high-throughput proteomics pipeline.

Bioinformatics. 2008 Apr 1;24(7):950-7. doi: 10.1093/bioinformatics/btn059. Epub 2008 Feb 28.

Bayesian analysis of mass spectrometry proteomic data using wavelet-based functional mixed models.

Biometrics. 2008 Jun;64(2):479-89. doi: 10.1111/j.1541-0420.2007.00895.x. Epub 2007 Sep 20.

Probabilistic multi-class multi-kernel learning: on protein fold recognition and remote homology detection.

Bioinformatics. 2008 May 15;24(10):1264-70. doi: 10.1093/bioinformatics/btn112. Epub 2008 Mar 31.

A support vector machine model for the prediction of proteotypic peptides for accurate mass and time proteomics.

Bioinformatics. 2008 Jul 1;24(13):1503-9. doi: 10.1093/bioinformatics/btn218. Epub 2008 May 3.

Cited by

Joint Bounding of Peaks Across Samples Improves Differential Analysis in Mass Spectrometry-Based Metabolomics.

Anal Chem. 2017 Mar 21;89(6):3517-3523. doi: 10.1021/acs.analchem.6b04719. Epub 2017 Mar 7.

Signal Partitioning Algorithm for Highly Efficient Gaussian Mixture Modeling in Mass Spectrometry.

PLoS One. 2015 Jul 31;10(7):e0134256. doi: 10.1371/journal.pone.0134256. eCollection 2015.

Isotope pattern deconvolution for peptide mass spectrometry by non-negative least squares/least absolute deviation template matching.

BMC Bioinformatics. 2012 Nov 8;13:291. doi: 10.1186/1471-2105-13-291.

Features-based deisotoping method for tandem mass spectra.

Adv Bioinformatics. 2011;2011:210805. doi: 10.1155/2011/210805. Epub 2012 Jan 4.

Accurate peak list extraction from proteomic mass spectra for identification and profiling studies.

BMC Bioinformatics. 2010 Oct 16;11:518. doi: 10.1186/1471-2105-11-518.

BPDA - a Bayesian peptide detection algorithm for mass spectrometry.

BMC Bioinformatics. 2010 Sep 29;11:490. doi: 10.1186/1471-2105-11-490.

A scale space approach for unsupervised feature selection in mass spectra classification for ovarian cancer detection.

BMC Bioinformatics. 2009 Oct 15;10 Suppl 12(Suppl 12):S9. doi: 10.1186/1471-2105-10-S12-S9.

Reversible jump MCMC approach for peak identification for stroke SELDI mass spectrometry using mixture model.

Bioinformatics. 2008 Jul 1;24(13):i407-13. doi: 10.1093/bioinformatics/btn143.
