Pelikan Richard C, Hauskrecht Milos
Departments of Biomedical Informatics.
AMIA Annu Symp Proc. 2010 Nov 13;2010:632-6.
Mass spectrometry proteomic profiling has the potential to be a useful clinical screening tool. One obstacle is the need for a standardized method for preprocessing the noisy raw data. We have developed a system that automatically selects a set of preprocessing methods from among several candidates. Because the system is automated, the analyst does not need to know which methods to use on any given dataset. At each stage of preprocessing, many competing methods are considered. We introduce metrics that balance each method's attempts to correct noise against the preservation of valuable discriminative information. We demonstrate the benefit of our preprocessing system on several surface-enhanced laser desorption/ionization (SELDI) and matrix-assisted laser desorption/ionization (MALDI) mass spectrometry datasets. Downstream classification improves when the data are preprocessed with our system.
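The stage-by-stage selection described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the metrics or the candidate methods, so everything named here is an assumption made for illustration: the Fisher-like separation score, the roughness measure used as a noise proxy, the `trade_off` weight, and the toy candidates (`subtract_minimum`, `moving_average`). This is not the authors' implementation.

```python
from statistics import mean, pvariance

def fisher_score(spectra, labels):
    """Assumed discriminative metric: mean Fisher-like class separation per feature."""
    a = [s for s, y in zip(spectra, labels) if y == 0]
    b = [s for s, y in zip(spectra, labels) if y == 1]
    scores = []
    for i in range(len(spectra[0])):
        xa = [s[i] for s in a]
        xb = [s[i] for s in b]
        spread = pvariance(xa) + pvariance(xb) + 1e-12  # avoid division by zero
        scores.append((mean(xa) - mean(xb)) ** 2 / spread)
    return mean(scores)

def roughness(spectra):
    """Assumed noise proxy: mean squared difference between adjacent points."""
    return mean(
        mean((s[i + 1] - s[i]) ** 2 for i in range(len(s) - 1)) for s in spectra
    )

# Hypothetical candidate methods; each maps a list of spectra to a new list.
def identity(spectra):
    return [list(s) for s in spectra]

def subtract_minimum(spectra):
    return [[v - min(s) for v in s] for s in spectra]

def moving_average(spectra, width=3):
    half = width // 2
    return [
        [mean(s[max(0, i - half): i + half + 1]) for i in range(len(s))]
        for s in spectra
    ]

def select_pipeline(spectra, labels, stages, trade_off=1.0):
    """For each stage, keep the candidate with the best combined score."""
    chosen = []
    for stage_name, candidates in stages:
        best = None
        for name, method in candidates.items():
            result = method(spectra)
            # Reward class separation, penalize residual roughness (assumed form).
            score = fisher_score(result, labels) - trade_off * roughness(result)
            if best is None or score > best[0]:
                best = (score, name, result)
        chosen.append((stage_name, best[1]))
        spectra = best[2]  # the winning output feeds the next stage
    return chosen, spectra

# Toy data: four short "spectra" in two classes.
toy_spectra = [[1, 5, 1, 9, 1], [1, 6, 1, 8, 1], [1, 5, 1, 2, 1], [1, 6, 1, 3, 1]]
toy_labels = [0, 0, 1, 1]
toy_stages = [
    ("baseline", {"none": identity, "subtract_min": subtract_minimum}),
    ("smoothing", {"none": identity, "moving_average": moving_average}),
]
chosen, processed = select_pipeline(toy_spectra, toy_labels, toy_stages)
print(chosen)
```

The greedy, one-stage-at-a-time choice is itself a simplifying assumption; a real system might score whole pipelines or use cross-validated classification performance instead.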