Sugimoto Masahiro, Kawakami Masato, Robert Martin, Soga Tomoyoshi, Tomita Masaru
Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata 997-0017, Japan.
Curr Bioinform. 2012 Mar;7(1):96-108. doi: 10.2174/157489312799304431.
Biological systems are increasingly being studied in a holistic manner, using omics approaches, to provide quantitative and qualitative descriptions of the diverse collection of cellular components. Among the omics approaches, metabolomics, which deals with the quantitative global profiling of small molecules or metabolites, is being used extensively to explore the dynamic response of living systems, such as organelles, cells, tissues, organs and whole organisms, under diverse physiological and pathological conditions. This technology is now used routinely in a number of applications, including basic and clinical research, agriculture, microbiology, food science, nutrition, pharmaceutical research, environmental science and the development of biofuels. Of the multiple analytical platforms available to perform such analyses, nuclear magnetic resonance and mass spectrometry have come to dominate, owing to the high resolution and large datasets that can be generated with these techniques. The large multidimensional datasets that result from such studies must be processed and analyzed to render these data meaningful. Thus, bioinformatics tools are essential for the efficient processing of huge datasets, the characterization of the detected signals, and the alignment of multiple datasets and their features. This paper provides a state-of-the-art overview of the data processing tools available, and reviews a collection of recent reports on the topic. Data conversion, pre-processing, alignment, normalization and statistical analysis are introduced, with their advantages and disadvantages, and comparisons are made to guide the reader.