Department of Electrical and Computer Engineering, Virginia Tech, Washington, DC 20057, USA
IEEE/ACM Trans Comput Biol Bioinform. 2013 Mar-Apr;10(2):494-503. doi: 10.1109/TCBB.2013.25.
A Bayesian alignment model (BAM) is proposed for alignment of liquid chromatography-mass spectrometry (LC-MS) data. BAM belongs to the category of profile-based approaches, which are composed of two major components: a prototype function and a set of mapping functions. Appropriate estimation of these functions is crucial for good alignment results. BAM uses Markov chain Monte Carlo (MCMC) methods to draw inference on the model parameters and improves on existing MCMC-based alignment methods through 1) the implementation of an efficient MCMC sampler and 2) an adaptive selection of knots. A block Metropolis-Hastings algorithm that mitigates the problem of the MCMC sampler getting stuck at local modes of the posterior distribution is used for the update of the mapping function coefficients. In addition, a stochastic search variable selection (SSVS) methodology is used to determine the number and positions of knots. We applied BAM to a simulated data set, an LC-MS proteomic data set, and two LC-MS metabolomic data sets, and compared its performance with the Bayesian hierarchical curve registration (BHCR) model, the dynamic time-warping (DTW) model, and the continuous profile model (CPM). The advantage of applying appropriate profile-based retention time correction prior to performing a feature-based approach is also demonstrated through the metabolomic data sets.
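To make the block Metropolis-Hastings idea mentioned in the abstract concrete, the following is a minimal, generic sketch and not the authors' sampler. It assumes a toy two-dimensional correlated Gaussian target in place of the BAM posterior, and the names `log_target`, `block_mh`, `blocks`, and `step` are hypothetical. The point it illustrates is that a group of coefficients is proposed and accepted or rejected jointly, rather than one coordinate at a time.

```python
import numpy as np


def log_target(theta):
    """Toy unnormalized log-density (assumption): correlated 2-D Gaussian."""
    rho = 0.9
    x, y = theta
    return -(x * x - 2.0 * rho * x * y + y * y) / (2.0 * (1.0 - rho * rho))


def block_mh(log_p, theta0, blocks, step, n_iter, rng):
    """Metropolis-Hastings that updates one block of coordinates at a time.

    blocks is a list of index lists; every index in a block is proposed
    and accepted or rejected together.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    current = log_p(theta)
    samples = np.empty((n_iter, theta.size))
    accepted = 0
    for it in range(n_iter):
        for block in blocks:
            proposal = theta.copy()
            # Symmetric Gaussian random-walk proposal on this block only.
            proposal[block] += step * rng.standard_normal(len(block))
            candidate = log_p(proposal)
            # The proposal is symmetric, so the acceptance ratio reduces
            # to the ratio of target densities.
            if np.log(rng.uniform()) < candidate - current:
                theta, current = proposal, candidate
                accepted += 1
        samples[it] = theta
    return samples, accepted / (n_iter * len(blocks))


rng = np.random.default_rng(0)
# One block holding both coordinates: they move jointly on every update.
samples, acc_rate = block_mh(log_target, [0.0, 0.0], [[0, 1]], 0.8, 5000, rng)
```

Passing `blocks=[[0], [1]]` instead gives a coordinate-wise sampler, which is a convenient way to compare mixing on strongly correlated parameters. A tuned or adaptive proposal covariance would normally replace the fixed `step` used here.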