Koslovsky Matthew D
Department of Statistics, Colorado State University, Fort Collins, CO, USA.
BMC Bioinformatics. 2025 Feb 27;26(1):69. doi: 10.1186/s12859-025-06078-4.
The human microbiome is the collection of microorganisms living on and inside our bodies. A major aim of microbiome research is to understand the role microbial communities play in human health, with the goal of designing personalized interventions that modulate the microbiome to treat or prevent disease. Microbiome data are challenging to analyze because of their high dimensionality, overdispersion, and zero inflation. Analysis is further complicated by the steps taken to collect and process microbiome samples. For example, sequencing instruments have a fixed capacity for the total number of reads delivered, so observed counts carry information only about relative, not absolute, abundances. It is therefore essential to treat microbial samples as compositional. Another complicating factor in modeling microbiome data is that taxa counts are subject to measurement error introduced at various stages of the measurement protocol. Advances in sequencing technology and preprocessing pipelines, coupled with our growing knowledge of the human microbiome, have reduced, but not eliminated, measurement error. Ignoring measurement error during analysis, though common in practice, can lead to biased inference and curb reproducibility. We propose a Dirichlet-multinomial modeling framework for microbiome data with excess zeros and potential taxonomic misclassification. We demonstrate how accommodating taxonomic misclassification improves estimation performance and investigate differences in gut microbial composition between healthy and obese children.
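To make the data features named in the abstract concrete, the following is a minimal simulation sketch, not the paper's model or code. It generates counts that are compositional (fixed sequencing depth), overdispersed (Dirichlet-multinomial), zero-inflated, and subject to taxonomic misclassification. The function name `simulate_counts`, its parameters, the per-taxon absence probability `zero_prob`, and the row-stochastic `confusion` matrix are all assumptions introduced here for illustration; the abstract does not specify how the proposed framework parameterizes excess zeros or misclassification.

```python
import numpy as np


def simulate_counts(n_samples, alpha, zero_prob, confusion, depth, seed=0):
    """Simulate observed taxa counts under illustrative (assumed) mechanisms.

    alpha     : Dirichlet concentration parameters, one per taxon
    zero_prob : probability that a taxon is structurally absent from a sample
    confusion : row-stochastic matrix; confusion[j, k] is the probability that
                a read from true taxon j is recorded as taxon k
    depth     : fixed total number of reads per sample
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    confusion = np.asarray(confusion, dtype=float)
    n_taxa = alpha.size
    observed = np.zeros((n_samples, n_taxa), dtype=np.int64)

    for i in range(n_samples):
        # Excess zeros: each taxon is absent with probability zero_prob.
        present = rng.random(n_taxa) >= zero_prob
        if not present.any():
            present[rng.integers(n_taxa)] = True  # keep at least one taxon

        # Dirichlet draw restricted to present taxa (normalized gamma draws).
        g = rng.gamma(alpha) * present
        p = g / g.sum()

        # Compositional constraint: reads per sample always sum to `depth`.
        true_counts = rng.multinomial(depth, p)

        # Misclassification: redistribute each taxon's reads across labels.
        for j in range(n_taxa):
            if true_counts[j] > 0:
                observed[i] += rng.multinomial(true_counts[j], confusion[j])

    return observed


# Hypothetical settings: 4 taxa, 10% of reads mislabeled uniformly at random.
confusion = 0.9 * np.eye(4) + 0.025 * np.ones((4, 4))
counts = simulate_counts(
    n_samples=5, alpha=[1.0, 1.0, 1.0, 1.0], zero_prob=0.3,
    confusion=confusion, depth=1000,
)
print(counts)
```

In this sketch, misclassification can turn a structurally absent taxon into a small nonzero observed count, which is one way ignoring measurement error could distort inference about which taxa are truly present.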