Peak intensity prediction in MALDI-TOF mass spectrometry: a machine learning study to support quantitative proteomics.

Authors

Timm Wiebke, Scherbart Alexandra, Böcker Sebastian, Kohlbacher Oliver, Nattkemper Tim W

Affiliation

Applied Neuroinformatics Group, Bielefeld University, Germany.

Publication

BMC Bioinformatics. 2008 Oct 20;9:443. doi: 10.1186/1471-2105-9-443.

Abstract

BACKGROUND

Mass spectrometry is a key technique in proteomics and can be used to analyze complex samples quickly. One key problem with the mass spectrometric analysis of peptides and proteins, however, is the fact that absolute quantification is severely hampered by the unclear relationship between the observed peak intensity and the peptide concentration in the sample. While there are numerous approaches to circumvent this problem experimentally (e.g. labeling techniques), reliable prediction of the peak intensities from peptide sequences could provide a peptide-specific correction factor. Thus, it would be a valuable tool towards label-free absolute quantification.

RESULTS

In this work we present machine learning techniques for peak intensity prediction for MALDI mass spectra. Features encoding the peptides' physico-chemical properties as well as string-based features were extracted. A feature subset was obtained from multiple forward feature selections on the extracted features. Based on these features, two advanced machine learning methods (support vector regression and local linear maps) are shown to yield good results for this problem (Pearson correlation of 0.68 in a ten-fold cross validation).
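The workflow described above (string-based features, forward feature selection, Pearson correlation under ten-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the feature set, the function names, and the synthetic peptides and intensities are all assumptions, a single greedy forward selection stands in for the multiple selections mentioned in the abstract, and a k-nearest-neighbour regressor is swapped in for support vector regression and local linear maps to keep the example dependency-free.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def string_features(peptide):
    """Hypothetical string-based features: length plus amino-acid fractions."""
    n = len(peptide)
    return [float(n)] + [peptide.count(a) / n for a in AMINO_ACIDS]

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def knn_predict(train_X, train_y, x, k=3):
    """k-nearest-neighbour regression: a simple stand-in, NOT SVR or LLM."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def cross_validated_pearson(X, y, folds=10, seed=0):
    """Predict every sample while it is held out, then correlate with y."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    preds = [0.0] * len(X)
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        for i in test:
            preds[i] = knn_predict(train_X, train_y, X[i])
    return pearson(preds, y)

def forward_selection(X, y, max_features=3):
    """Greedily add the feature that most improves the cross-validated score."""
    selected, best = [], float("-inf")
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scored = []
        for c in remaining:
            cols = selected + [c]
            X_sub = [[row[j] for j in cols] for row in X]
            scored.append((cross_validated_pearson(X_sub, y), c))
        score, cand = max(scored)
        if score <= best:
            break  # no remaining feature improves the score
        selected.append(cand)
        remaining.remove(cand)
        best = score
    return selected, best

# Synthetic demo data (NOT from the paper): random peptides whose "intensity"
# depends on the fraction of basic residues (R, K) plus Gaussian noise.
rng = random.Random(1)
peptides = ["".join(rng.choice(AMINO_ACIDS) for _ in range(rng.randint(7, 15)))
            for _ in range(60)]
X = [string_features(p) for p in peptides]
y = [3.0 * (p.count("R") + p.count("K")) / len(p) + rng.gauss(0, 0.05)
     for p in peptides]

selected, score = forward_selection(X, y)
print("selected feature indices:", selected)
print("cross-validated Pearson r:", round(score, 3))
```

Pooling the held-out predictions from all folds before computing a single correlation is one common way to report a cross-validated Pearson coefficient; averaging per-fold correlations is an alternative, and the abstract does not say which was used.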

CONCLUSION

The techniques presented here are a useful first step beyond the binary prediction of proteotypic peptides, towards a more quantitative prediction of peak intensities. These predictions will, in turn, prove beneficial for mass spectrometry-based quantitative proteomics.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ebf/2600826/4a1c9f794bf3/1471-2105-9-443-1.jpg
