Stoyanova Radka, Nicholls Andrew W, Nicholson Jeremy K, Lindon John C, Brown Truman R
Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111, USA.
J Magn Reson. 2004 Oct;170(2):329-35. doi: 10.1016/j.jmr.2004.07.009.
Pattern recognition techniques are effective tools for reducing the information contained in large spectral data sets to a much smaller number of significant features, which can then be used to interpret the chemical or biochemical system under study. The effectiveness of such approaches is often impeded by experimental and instrument-induced variations in the position, phase, and line width of the spectral peaks. Although characterizing the cause and magnitude of these fluctuations can be important in its own right (pH-induced NMR chemical shift changes, for example), in general they obscure the process of pattern discovery. One major area of application is the use of large databases of ¹H NMR spectra of biofluids such as urine to investigate perturbations in metabolic profiles caused by drugs or disease, an approach now termed metabonomics. Frequency shifts of individual peaks are the dominant source of such unwanted variation in this type of data. In this paper, an automatic procedure for aligning the individual peaks in the data set is described and evaluated. The proposed method will be vital for the efficient and automatic analysis of large metabonomic data sets and should also be applicable to other types of data.