Bootstrap classification and point-based feature selection from age-staged mouse cerebellum tissues of matrix assisted laser desorption/ionization mass spectra using a fuzzy rule-building expert system.

Author information

Harrington Peter B, Laurent Claudine, Levinson Douglas F, Levitt Pat, Markey Sanford P

Affiliation

Ohio University Center for Intelligent Chemical Instrumentation, Department of Chemistry & Biochemistry, Clippinger Laboratories, Athens, OH 45701-2979, USA.

Publication information

Anal Chim Acta. 2007 Sep 19;599(2):219-31. doi: 10.1016/j.aca.2007.08.007. Epub 2007 Aug 6.

Abstract

A bootstrap method for point-based detection of candidate biomarker peaks has been developed from pattern classifiers. Point-based detection methods are advantageous in comparison to peak-based methods. Peak determination and selection are problematic when spectral peaks are not baseline-resolved or lie on a varying baseline. The benefit of point-based detection is that peaks can be globally determined from the characteristic features of the entire data set (i.e., subsets of candidate points) as opposed to the traditional method of selecting peaks from individual spectra and then combining the peak lists into a data set. Using a synthetic data set, the point-based method is demonstrated to be more effective and efficient than feature selection using Mahalanobis distance. In addition, probabilities that characterize the uniqueness of the peaks are determined. This method was applied for detecting peaks that characterize age-specific patterns of protein expression of developing and adult mouse cerebella from matrix-assisted laser desorption/ionization (MALDI) mass spectrometry (MS) data. The mice comprised three age groups: 42 adults, 19 14-day-old pups, and 16 7-day-old pups. Three sequential spectra were obtained from each tissue section to yield 126, 57, and 48 spectra for adults, 14-day-old pups, and 7-day-old pups, respectively. Each spectrum comprised 71,879 mass measurements in a range of 3.5-50 kDa. A previous study detected 846 unique peaks that were consistent for 50% of the mice in each age group (C. Laurent, D.F. Levinson, S.A. Schwartz, P.B. Harrington, S.P. Markey, R.M. Caprioli, P. Levitt, Direct profiling of the cerebellum by MALDI MS: a methodological study in postnatal and adult mouse, J. Neurosci. Res. 81 (2005) 613-621). A fuzzy rule-building expert system (FuRES) was applied to investigate the correlation of age with features in the MS data. FuRES detected two outlier 14-day-old pup spectra.
Prediction was evaluated using 100 bootstrap samples of 2 Latin partitions (i.e., a 50:50 split between training and prediction sets) of the mice. The spectra without the outliers yielded classification rates of 99.1 ± 0.1%, 90.1 ± 0.8%, and 97.0 ± 0.6% for adults, 14-day-old pups, and 7-day-old pups, respectively. At a 95% level of significance, 100 bootstrap samples disclosed 35 adult and 21 pup distinguishing peaks for separating adults from pups, and 8 14-day-old and 15 7-day-old predictive peaks for separating 14-day-old pup spectra from 7-day-old pup spectra. A compressed matrix comprising 40,393 points that were outside the 95% confidence intervals of one of the two FuRES discriminants was evaluated, and the classification improved significantly for all classes. When peaks that satisfied a quality criterion were integrated, the 55 integrated peak areas furnished significantly improved classification for all classes: the selected peak areas furnished classification rates of 100%, 97.3 ± 0.6%, and 97.4 ± 0.3% for adults, 14-day-old pups, and 7-day-old pups using 100 bootstrapped Latin-partition evaluations with the predictions averaged. When the bootstrap size was increased to 1000 samples, the results were not significantly affected. The FuRES predictions were consistent with those obtained by discriminant partial least squares (DPLS) classifications.
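
The evaluation scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: FuRES is not reproduced here, and a standardized class-mean difference per point stands in for the discriminant weights. The function names (`latin_partitions`, `mean_difference_weights`, `bootstrap_point_selection`), the two-class setup, and the use of percentile intervals are all assumptions made for illustration. The sketch splits mice (not individual spectra) 50:50 within each class, repeats the split many times, and keeps the points whose 95% interval of discriminant weights excludes zero.

```python
import numpy as np

def latin_partitions(labels, groups, n_parts=2, rng=None):
    """Assign every sample a partition index, stratified by class.

    All samples sharing a group id (e.g. spectra from one mouse) land in the
    same partition, and each sample is used for prediction exactly once.
    """
    rng = np.random.default_rng(rng)
    labels, groups = np.asarray(labels), np.asarray(groups)
    part_of_group = {}
    for c in np.unique(labels):
        ids = np.unique(groups[labels == c])
        rng.shuffle(ids)                      # random order within the class
        for i, gid in enumerate(ids):
            part_of_group[gid] = i % n_parts  # deal groups out like cards
    return np.array([part_of_group[g] for g in groups])

def mean_difference_weights(X, y):
    """Stand-in linear discriminant (NOT FuRES): one weight per data point."""
    a, b = X[y == 1], X[y == 0]
    pooled = np.sqrt(0.5 * (a.var(axis=0, ddof=1) + b.var(axis=0, ddof=1)))
    return (a.mean(axis=0) - b.mean(axis=0)) / (pooled + 1e-12)

def bootstrap_point_selection(X, y, groups, n_boot=100, alpha=0.05, seed=0):
    """Return indices of points whose weight interval excludes zero."""
    rng = np.random.default_rng(seed)
    weights = []
    for _ in range(n_boot):
        parts = latin_partitions(y, groups, n_parts=2, rng=rng)
        for held_out in (0, 1):               # each half trains once
            train = parts != held_out
            weights.append(mean_difference_weights(X[train], y[train]))
    weights = np.asarray(weights)
    lo, hi = np.percentile(
        weights, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0
    )
    selected = np.flatnonzero((lo > 0) | (hi < 0))
    return selected, lo, hi
```

Partitioning by mouse rather than by spectrum keeps the three replicate spectra of one animal out of both the training and prediction sets at the same time, which would otherwise inflate the apparent classification rate. With `X` as a spectra-by-points intensity matrix, `y` as 0/1 class labels, and `groups` as mouse identifiers, `bootstrap_point_selection(X, y, groups)` returns the candidate point indices together with the lower and upper interval bounds.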


