Zhou Ruwen, Ng Siu Kin, Sung Joseph Jao Yiu, Goh Wilson Wen Bin, Wong Sunny Hei
Lee Kong Chian School of Medicine, Nanyang Technological University, 11 Mandalay Road, 308232, Singapore.
Department of Gastroenterology and Hepatology, Tan Tock Seng Hospital, National Healthcare Group, 11 Jalan Tan Tock Seng, 308433, Singapore.
Comput Struct Biotechnol J. 2023 Oct 4;21:4804-4815. doi: 10.1016/j.csbj.2023.10.001. eCollection 2023.
The human microbiome is an emerging research frontier due to its profound impacts on health. High-throughput microbiome sequencing enables studying microbial communities but suffers from analytical challenges. In particular, the lack of dedicated preprocessing methods to improve data quality impedes effective minimization of biases prior to downstream analysis. This review aims to address this gap by providing a comprehensive overview of preprocessing techniques relevant to microbiome research. We outline a typical workflow for microbiome data analysis. Preprocessing methods discussed include quality filtering, batch effect correction, imputation of missing values, normalization, and data transformation. We highlight strengths and limitations of each technique to serve as a practical guide for researchers and identify areas needing further methodological development. Establishing robust, standardized preprocessing will be essential for drawing valid biological conclusions from microbiome studies.
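As a rough illustration of three of the preprocessing steps named above (quality filtering, normalization, and data transformation), the following is a minimal sketch and not the review's own method. It assumes a samples-by-taxa count matrix. The function names, the 50% prevalence threshold, and the 0.5 pseudocount are hypothetical choices made for the example. The pseudocount is a simple stand-in for zero handling, not one of the imputation methods the review surveys, and batch effect correction is omitted.

```python
import numpy as np


def prevalence_filter(counts, min_prevalence=0.1):
    """Keep taxa (columns) detected in at least `min_prevalence` of samples."""
    prevalence = (counts > 0).mean(axis=0)
    keep = prevalence >= min_prevalence
    return counts[:, keep], keep


def total_sum_scale(counts):
    """Normalize each sample (row) to relative abundances that sum to 1."""
    totals = counts.sum(axis=1, keepdims=True)
    if np.any(totals == 0):
        raise ValueError("at least one sample has zero total reads")
    return counts / totals


def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform; the pseudocount (an assumption) avoids log(0)."""
    logged = np.log(counts + pseudocount)
    return logged - logged.mean(axis=1, keepdims=True)


# Toy count table (invented for illustration): 4 samples x 5 taxa.
counts = np.array([
    [120, 0, 30, 0, 5],
    [80, 0, 45, 0, 0],
    [200, 0, 10, 1, 12],
    [95, 0, 60, 0, 3],
], dtype=float)

filtered, kept = prevalence_filter(counts, min_prevalence=0.5)
relative = total_sum_scale(filtered)
clr = clr_transform(filtered)
```

Here the filter runs first, so that rarely observed taxa do not influence the per-sample scaling or the log-ratio centering. Whether that ordering suits a given dataset is a judgment call rather than a fixed rule.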