Rathod Dharmik, Silverman Justin D
College of Information Sciences and Technology, The Pennsylvania State University.
Department of Statistics, The Pennsylvania State University.
bioRxiv. 2025 Jul 31:2025.07.31.667904. doi: 10.1101/2025.07.31.667904.
Polymerase Chain Reaction (PCR) is a critical step in amplicon-based microbial community profiling, allowing the selective amplification of marker genes such as 16S rRNA from environmental or host-associated samples. Despite its widespread use, PCR is known to introduce amplification bias, in which some DNA sequences are preferentially amplified over others due to factors such as primer-template mismatches, sequence GC content, and secondary structures. Although these biases are known to affect transcript abundance, their implications for ecological metrics remain poorly understood. In this study, we conduct a comprehensive evaluation of how PCR bias influences both within-sample (α-diversity) and between-sample (β-diversity) analyses. We show that perturbation-invariant diversity measures remain unaffected by PCR bias, but widely used metrics such as Shannon diversity and weighted UniFrac are sensitive, with their values varying according to the true community composition. To address this, we provide theoretical and empirical insight into how PCR-induced bias varies across ecological analyses and community structures, and we offer practical guidance on when bias-correction methods should be applied. Our findings highlight the importance of selecting appropriate diversity metrics for PCR-based microbial ecology workflows.
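The contrast between perturbation-invariant and bias-sensitive metrics can be illustrated with a small numerical sketch. It is not taken from the paper. It assumes a simple multiplicative bias model (each taxon is amplified by a fixed per-cycle efficiency, and the result is renormalized), uses the Aitchison distance as one example of a perturbation-invariant measure, and relies on hypothetical efficiencies, cycle count, and community compositions chosen only for illustration.

```python
import numpy as np

def closure(x):
    """Rescale a positive vector so that it sums to 1 (a composition)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def apply_pcr_bias(p, efficiency, cycles):
    """Assumed bias model: multiply each taxon by efficiency**cycles, then renormalize."""
    return closure(np.asarray(p, dtype=float) * np.asarray(efficiency, dtype=float) ** cycles)

def shannon(p):
    """Shannon diversity (natural log) of a composition."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def clr(p):
    """Centered log-ratio transform; requires strictly positive entries."""
    logp = np.log(np.asarray(p, dtype=float))
    return logp - logp.mean()

def aitchison_distance(p, q):
    """Euclidean distance between CLR-transformed compositions."""
    return float(np.linalg.norm(clr(p) - clr(q)))

# Hypothetical true compositions of two samples over the same three taxa.
true_a = closure([0.50, 0.30, 0.20])
true_b = closure([0.10, 0.30, 0.60])

# Hypothetical relative per-cycle amplification efficiencies, shared by both samples.
efficiency = np.array([1.00, 0.95, 0.90])
cycles = 30

obs_a = apply_pcr_bias(true_a, efficiency, cycles)
obs_b = apply_pcr_bias(true_b, efficiency, cycles)

# Shannon diversity shifts under bias, and by a different amount in each sample.
shannon_shift_a = shannon(obs_a) - shannon(true_a)
shannon_shift_b = shannon(obs_b) - shannon(true_b)
print("Shannon shift, sample A:", shannon_shift_a)
print("Shannon shift, sample B:", shannon_shift_b)

# The Aitchison distance between samples is the same before and after bias.
print("Aitchison distance (true):    ", aitchison_distance(true_a, true_b))
print("Aitchison distance (observed):", aitchison_distance(obs_a, obs_b))
```

Under this assumed model the bias acts as the same compositional perturbation on every sample, so it cancels in any between-sample log-ratio difference. Shannon diversity has no such cancellation, which is why its shift depends on the starting composition. The sketch requires strictly positive abundances; handling zeros is outside its scope.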