Wang Yuan, Zhou Xiaobo, Wang Honghui, Li King, Yao Lixiu, Wong Stephen T C
Center for Biotechnology and Informatics, The Methodist Hospital Research Institute, and Department of Radiology, The Methodist Hospital, Weill Cornell Medical College, Houston, TX 77030, USA.
Bioinformatics. 2008 Jul 1;24(13):i407-13. doi: 10.1093/bioinformatics/btn143.
Mass spectrometry (MS) has shown great potential in detecting disease-related biomarkers for early diagnosis of stroke. To discover potential biomarkers from large volumes of noisy MS data, peak detection must be performed first. This article proposes a novel automatic peak detection method for stroke MS data. In this method, a mixture model is used to model the spectrum. A Bayesian approach is used to estimate the parameters of the mixture model, and a Markov chain Monte Carlo (MCMC) method is employed to perform Bayesian inference. By introducing a reversible jump method, the number of peaks in the model can be estimated automatically. Instead of separating peak detection into substeps, the proposed method performs baseline correction, denoising and peak identification simultaneously. It therefore minimizes the risk of introducing irrecoverable bias and errors from each substep. In addition, the method does not require a manually selected denoising threshold. Experimental results on both a simulated dataset and a stroke MS dataset show that the proposed method not only detects peaks with a small signal-to-noise ratio, but also greatly reduces the false detection rate while maintaining the same sensitivity.
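The general idea of fitting a peak model to a spectrum by MCMC can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the peak shape, the baseline form, the priors or the proposal distribution, so the Gaussian peak shape, the linear baseline, the flat priors and the random-walk step sizes below are all assumptions, as are the function names `spectrum_model`, `log_likelihood` and `metropolis_peak`. To keep it short, the baseline and noise level are treated as known and the number of peaks is fixed at one, so the reversible jump step that estimates the number of peaks is omitted.

```python
import numpy as np

def spectrum_model(mz, baseline, peaks):
    """Assumed model: linear baseline plus a sum of Gaussian-shaped peaks.

    baseline: (intercept, slope); peaks: iterable of (position, height, width).
    """
    y = baseline[0] + baseline[1] * mz
    for pos, height, width in peaks:
        y = y + height * np.exp(-0.5 * ((mz - pos) / width) ** 2)
    return y

def log_likelihood(mz, intensity, baseline, peaks, noise_sd):
    """Gaussian noise log-likelihood, up to an additive constant."""
    resid = intensity - spectrum_model(mz, baseline, peaks)
    return -0.5 * np.sum((resid / noise_sd) ** 2) - resid.size * np.log(noise_sd)

def metropolis_peak(mz, intensity, baseline, init_peak, noise_sd,
                    n_iter=4000, seed=0):
    """Random-walk Metropolis sampler for one peak (position, height, width).

    Assumes flat priors: position inside the m/z range, height and width > 0.
    """
    rng = np.random.default_rng(seed)
    step = np.array([0.05, 0.05, 0.05])  # illustrative proposal std devs
    current = np.array(init_peak, dtype=float)
    current_ll = log_likelihood(mz, intensity, baseline, [current], noise_sd)
    samples = np.empty((n_iter, 3))
    for i in range(n_iter):
        proposal = current + step * rng.normal(size=3)
        in_support = (mz[0] <= proposal[0] <= mz[-1]
                      and proposal[1] > 0 and proposal[2] > 0)
        if in_support:
            prop_ll = log_likelihood(mz, intensity, baseline, [proposal], noise_sd)
            # Symmetric proposal and flat prior: accept on the likelihood ratio.
            if np.log(rng.uniform()) < prop_ll - current_ll:
                current, current_ll = proposal, prop_ll
        samples[i] = current
    return samples

# Usage on a simulated spectrum (all values are illustrative).
rng = np.random.default_rng(1)
mz = np.linspace(0.0, 100.0, 500)
baseline = (1.0, 0.01)
observed = (spectrum_model(mz, baseline, [(40.0, 5.0, 2.0)])
            + rng.normal(scale=0.3, size=mz.size))
samples = metropolis_peak(mz, observed, baseline,
                          init_peak=(39.0, 4.0, 2.5), noise_sd=0.3)
posterior_mean = samples[len(samples) // 2:].mean(axis=0)  # drop first half as burn-in
```

Because the peak is estimated from the full likelihood rather than from a thresholded, pre-smoothed signal, no denoising threshold appears anywhere in the sketch. A sampler closer to the method described above would also sample the baseline and noise parameters and add birth and death moves over the number of peaks.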