Chierici Marco, Albanese Davide, Franceschi Pietro, Furlanello Cesare
Fondazione Bruno Kessler, Trento, Italy.
Mol Biosyst. 2012 Nov;8(11):2845-9. doi: 10.1039/c2mb25223f. Epub 2012 Aug 9.
Many sources of variability can affect the reproducibility of disease biomarkers derived from time-of-flight (TOF) mass spectrometry (MS) data. Here we present TOFwave, a complete software pipeline for TOF-MS biomarker identification that limits the impact of parameter tuning along the whole chain of preprocessing and model selection modules. Peak profiles are obtained by preprocessing based on the Continuous Wavelet Transform (CWT), coupled with a machine learning protocol designed to avoid selection bias effects. Only two parameters (a minimum peak width and a signal-to-noise cutoff) have to be set explicitly. The TOFwave pipeline is built on top of the mlpy Python package. Examples on Matrix-Assisted Laser Desorption and Ionization (MALDI) TOF datasets are presented. The software prototype, datasets, and details needed to replicate the results in this paper can be found at http://mlpy.sf.net/tofwave/.