Oh Jung Hun, Gurnani Prem, Schorge John, Rosenblatt Kevin P, Gao Jean X
Department of Computer Science and Engineering, University of Texas, Arlington, TX 76019, USA.
IEEE Trans Inf Technol Biomed. 2009 Mar;13(2):195-206. doi: 10.1109/TITB.2008.2007909. Epub 2008 Dec 31.
High-resolution matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry has recently shown promise as a screening tool for detecting discriminatory peptide/protein patterns. The major computational obstacle in finding such patterns is the large number of mass/charge peaks (features, biomarkers, data points) in a spectrum. To tackle this problem, we have developed methods for data preprocessing and biomarker selection. The preprocessing consists of binning, baseline correction, and normalization. For biomarker detection, we developed an algorithm, the extended Markov blanket, which combines redundant feature removal with discriminant feature selection. The biomarker selection is coupled with a support vector machine to predict sample class from high-resolution proteomic profiles. We applied the algorithm to a recurrent ovarian cancer study that contains platinum-sensitive and platinum-resistant samples after treatment. Experiments show that the proposed method performs better than other feature selection algorithms. In particular, it yields good sensitivity and specificity compared with the other methods.
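The three preprocessing steps named above (binning, baseline correction, normalization) can be illustrated with a minimal sketch. The abstract does not say how each step is carried out, so everything specific here is an assumption made for illustration: fixed-width m/z bins, a rolling-minimum baseline estimate, and normalization by total ion current. The function names and the `bin_width` and `window` parameters are hypothetical and are not taken from the paper.

```python
import math


def bin_spectrum(mz, intensity, bin_width=1.0):
    """Sum intensities that fall into fixed-width m/z bins (assumed scheme)."""
    lo, hi = min(mz), max(mz)
    n_bins = max(1, math.ceil((hi - lo) / bin_width))
    binned = [0.0] * n_bins
    for m, y in zip(mz, intensity):
        # Clamp the index so the largest m/z value lands in the last bin.
        idx = min(int((m - lo) / bin_width), n_bins - 1)
        binned[idx] += y
    centers = [lo + bin_width * (i + 0.5) for i in range(n_bins)]
    return centers, binned


def subtract_baseline(intensity, window=5):
    """Subtract a rolling-minimum baseline (assumed method, not the paper's)."""
    half = window // 2
    n = len(intensity)
    corrected = []
    for i in range(n):
        # The local window is truncated at the ends of the spectrum.
        local = intensity[max(0, i - half):min(n, i + half + 1)]
        corrected.append(intensity[i] - min(local))
    return corrected


def normalize_tic(intensity):
    """Scale intensities so they sum to 1 (total ion current, assumed)."""
    total = sum(intensity)
    if total <= 0:
        return list(intensity)
    return [y / total for y in intensity]


def preprocess(mz, intensity, bin_width=1.0, window=5):
    """Run binning, baseline correction, and normalization in that order."""
    centers, binned = bin_spectrum(mz, intensity, bin_width)
    return centers, normalize_tic(subtract_baseline(binned, window))
```

Binning comes first in this sketch because it reduces the number of data points that the later steps and any downstream feature selection must handle. The resulting vectors could then be passed to a feature selector and a classifier such as a support vector machine.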