Identifying variables responsible for clustering in discriminant analysis of data from infrared microspectroscopy of a biological sample.

Author information

Martin Francis L, German Matthew J, Wit Ernst, Fearn Thomas, Ragavan Narasimhan, Pollock Hubert M

Affiliation

Biomedical Sciences Unit, Lancaster University, Lancaster, United Kingdom.

Publication information

J Comput Biol. 2007 Nov;14(9):1176-84. doi: 10.1089/cmb.2007.0057.

Abstract

In the biomedical field, infrared (IR) spectroscopic studies can involve the processing of data derived from many samples, divided into classes such as category of tissue (e.g., normal or cancerous) or patient identity. We require reliable methods to identify the class-specific information on which of the wavenumbers, representing various molecular groups, are responsible for observed class groupings. Employing a prostate tissue sample divided into three regions (transition zone, peripheral zone, and adjacent adenocarcinoma), and interrogated using synchrotron Fourier-transform IR microspectroscopy, we compared two statistical methods: (a) a new "cluster vector" version of principal component analysis (PCA) in which the dimensions of the dataset are reduced, followed by linear discriminant analysis (LDA) to reveal clusters, through each of which a vector is constructed that identifies the contributory wavenumbers; and (b) stepwise LDA, which exploits the fact that spectral peaks which identify certain chemical bonds extend over several wavenumbers, and which following classification via either one or two wavenumbers, checks whether the resulting predictions are stable across a range of nearby wavenumbers. Stepwise LDA is the simpler of the two methods; the cluster vector approach can indicate which of the different classes of spectra exhibit the significant differences in signal seen at the "prominent" wavenumbers identified. In situations where IR spectra are found to separate into classes, the excellent agreement between the two quite different methods points to what will prove to be a new and reliable approach to establishing which molecular groups are responsible for such separation.
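The abstract describes method (a) only in outline, so the following is a minimal sketch of one way a PCA-LDA "cluster vector" could be computed, not the authors' implementation. It assumes that a cluster vector is the class centroid in discriminant space mapped back onto the wavenumber axis through the combined PCA and LDA loadings. The function names, the number of retained components, and the synthetic three-class spectra are all hypothetical and stand in for real IR data.

```python
import numpy as np


def pca_lda_cluster_vectors(X, y, n_pcs=5):
    """X: (n_spectra, n_wavenumbers) array; y: integer class labels.

    Returns LDA scores and one vector per class over the wavenumber axis.
    The cluster-vector definition used here is an assumption (see lead-in).
    """
    classes = np.unique(y)
    Xc = X - X.mean(axis=0)

    # PCA via SVD: rows of Vt are loadings over wavenumbers.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_pcs].T                  # (n_wavenumbers, n_pcs)
    T = Xc @ P                        # PC scores, (n_spectra, n_pcs)

    # LDA in PC space: within-class (Sw) and between-class (Sb) scatter.
    Sw = np.zeros((n_pcs, n_pcs))
    Sb = np.zeros((n_pcs, n_pcs))
    for c in classes:
        Tc = T[y == c]
        mc = Tc.mean(axis=0)
        Sw += (Tc - mc).T @ (Tc - mc)
        Sb += len(Tc) * np.outer(mc, mc)   # overall mean of T is zero

    # Discriminant directions: leading eigenvectors of Sw^-1 Sb.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1][: len(classes) - 1]
    W = evecs[:, order].real          # (n_pcs, n_classes - 1)

    scores = T @ W                    # spectra in discriminant space
    L = P @ W                         # discriminant loadings over wavenumbers

    # Assumed cluster vector: class centroid in discriminant space,
    # expressed back in wavenumber space.
    vectors = {int(c): L @ scores[y == c].mean(axis=0) for c in classes}
    return scores, vectors


def make_demo_spectra(n_per_class=30, n_wavenumbers=100, noise=0.05, seed=0):
    """Synthetic stand-in data: class 1 has a band at index 30, class 2 at 70."""
    rng = np.random.default_rng(seed)
    idx = np.arange(n_wavenumbers)

    def band(center):
        return np.exp(-0.5 * ((idx - center) / 2.0) ** 2)

    templates = [np.zeros(n_wavenumbers), band(30), band(70)]
    X = np.vstack([
        t + noise * rng.standard_normal((n_per_class, n_wavenumbers))
        for t in templates
    ])
    y = np.repeat(np.arange(3), n_per_class)
    return X, y


X, y = make_demo_spectra()
scores, vectors = pca_lda_cluster_vectors(X, y)
for label, vec in vectors.items():
    # Index of the wavenumber with the largest absolute contribution.
    print(label, int(np.argmax(np.abs(vec))))
```

In this sketch, the wavenumbers with the largest absolute entries in a class's vector are read as the ones contributing most to that class's separation. With real spectra, the number of principal components and any preprocessing (baseline correction, normalization) would need to be chosen for the dataset at hand.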
