Improved model-based, platform-independent feature extraction for mass spectrometry.

Authors

Noy Karin, Fasulo Daniel

Affiliation

Integrated Data System Department, Siemens Corporate Research, 755 College Road East, Princeton, NJ 08540, USA.

Publication

Bioinformatics. 2007 Oct 1;23(19):2528-35. doi: 10.1093/bioinformatics/btm385. Epub 2007 Aug 13.

Abstract

MOTIVATION

Mass spectrometry (MS) is increasingly being used for biomedical research. The typical analysis of MS data consists of several steps. Feature extraction is a crucial step since subsequent analyses are performed only on the detected features. Current methodologies applied to low-resolution MS, in which features are peaks or wavelet functions, are parameter-sensitive and inaccurate in the sense that peaks and wavelet functions do not directly correspond to the underlying molecules under observation. In high-resolution MS, the model-based approach is more appealing as it can provide a better representation of the MS signals by incorporating information about peak shapes and isotopic distributions. Current model-based techniques are computationally expensive; various algorithms have been proposed to improve the computational efficiency of this paradigm. However, these methods cannot deal well with overlapping features, especially when they are merged to create one broad peak. In addition, no method has been proven to perform well across different MS platforms.

RESULTS

We suggest a new model-based approach to feature extraction in which spectra are decomposed into a mixture of distributions derived from peptide models. By incorporating kernel-based smoothing and perceptual similarity for matching distributions, our statistical framework improves on existing methodologies in terms of computational efficiency and the accuracy of the results. Our model is parameterized by physical properties and is therefore applicable to different MS instruments and settings. We validate our approach on simulated data, and show that it outperforms commonly used tools on real high- and low-resolution MS, and MS/MS data sets.
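
The idea of decomposing a spectrum into a mixture of peptide-derived distributions can be illustrated with a small sketch. This is not the authors' algorithm: it omits the kernel-based smoothing and the perceptual-similarity matching, and it substitutes a plain expectation-maximization (EM) update for the mixture weights. The function names, the Gaussian peak shape, the Poisson-shaped isotope heights, and the `0.0005` mass scaling are all assumptions made for illustration; only the roughly 1.00335 Da isotope spacing is a physical constant.

```python
import math

ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference in Da


def peptide_template(mz_grid, mono_mz, charge, sigma, n_isotopes=4):
    """Hypothetical peptide model: Gaussian isotope peaks spaced 1/charge apart.

    Relative peak heights follow a Poisson shape whose mean grows with mass.
    Both the Poisson shape and the 0.0005 factor are assumed, not from the paper.
    """
    lam = 0.0005 * mono_mz * charge
    template = [0.0] * len(mz_grid)
    for k in range(n_isotopes):
        height = math.exp(-lam) * lam ** k / math.factorial(k)
        center = mono_mz + k * ISOTOPE_SPACING / charge
        for i, x in enumerate(mz_grid):
            template[i] += height * math.exp(-0.5 * ((x - center) / sigma) ** 2)
    total = sum(template)
    return [v / total for v in template]  # normalize to a distribution


def fit_mixture_weights(spectrum, templates, n_iter=500):
    """Estimate mixture proportions of fixed templates with EM updates."""
    k = len(templates)
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        updated = [0.0] * k
        for i, intensity in enumerate(spectrum):
            if intensity <= 0:
                continue
            mix = sum(w * t[i] for w, t in zip(weights, templates))
            if mix <= 0:
                continue
            # Share this bin's intensity among templates by responsibility.
            for j in range(k):
                updated[j] += intensity * weights[j] * templates[j][i] / mix
        norm = sum(updated)
        weights = [v / norm for v in updated]
    return weights


# Synthetic example: two overlapping peptides with different charge states.
mz_grid = [999.0 + 0.01 * i for i in range(700)]
template_a = peptide_template(mz_grid, mono_mz=1001.0, charge=1, sigma=0.15)
template_b = peptide_template(mz_grid, mono_mz=1001.3, charge=2, sigma=0.15)
spectrum = [1000.0 * (0.7 * a + 0.3 * b) for a, b in zip(template_a, template_b)]

weights = fit_mixture_weights(spectrum, [template_a, template_b])
print([round(w, 3) for w in weights])
```

On this noiseless synthetic spectrum the estimated weights should come out close to the 0.7 and 0.3 used to build it, even though the two isotope envelopes overlap. A real pipeline would also have to propose candidate templates and handle noise, which this sketch does not attempt.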

