Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America.
Division of Biochemical Toxicology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States of America.
PLoS One. 2022 Jan 31;17(1):e0263070. doi: 10.1371/journal.pone.0263070. eCollection 2022.
As a common medium-throughput technique, quantitative real-time polymerase chain reaction (qPCR) is widely used to measure levels of nucleic acids. Alongside accurate and complete data, experimenters unavoidably observe some incomplete and uncertainly determined qPCR data because the overall amounts of biological material, such as nucleic acids present in biofluids, are intrinsically low. When some samples have uncertainly determined qPCR data, some investigators apply the statistical complete-case method, excluding the subset of samples with uncertainly determined data from the analysis (CO), while others simply choose not to analyze these datasets at all (CNA). To include as many observations as possible when analyzing differential changes of interest between groups, some investigators set incomplete observations equal to the maximum quality qPCR cycle (MC), such as 32 or 40. Although straightforward, these methods may decrease the sample size, skew the data distribution, and compromise statistical power and research reproducibility across replicate qPCR studies. To overcome the shortcomings of these commonly used qPCR data analysis methods and to join the efforts to advance statistical analysis in rigorous preclinical research, we propose a robust nonparametric statistical cycle-to-threshold method (CTOT) for analyzing incomplete qPCR data in two-group comparisons. CTOT combines important characteristics of qPCR data with time-to-event statistical methodology, yielding a novel analytical method for qPCR data that is built around good-quality data from all subjects, whether certainly determined or not. With the benchmark full data (BFD) as the reference, we compared the abilities of the CTOT, CO, MC, and CNA statistical methods to detect differential changes of interest between groups with informative but uncertainly determined qPCR data.
Our simulations and applications show that CTOT improves the power of detecting and confirming differential changes in many situations over the three commonly used methods without excess type I errors. The robust nonparametric statistical method of CTOT helps leverage qPCR technology and increase the power to detect differential changes that may assist decision making with respect to biomarker detection and early diagnosis, with the goal of improving the management of patient healthcare.
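To make the time-to-event framing concrete, the following is a minimal sketch and not the authors' implementation. It assumes that an undetermined well can be treated as right-censored at the maximum quality cycle (here 40), and it uses a standard two-group log-rank test as the nonparametric comparison; the abstract does not state which test CTOT uses, so that choice is an assumption. The function names `encode_ct` and `logrank_test` and all Ct readings are hypothetical.

```python
import math

MAX_CYCLE = 40.0  # assumed maximum quality cycle; the abstract mentions 32 and 40


def encode_ct(ct_values, max_cycle=MAX_CYCLE):
    """Turn raw Ct readings into (cycle, observed) pairs.

    None (undetermined) or a reading beyond max_cycle becomes a
    right-censored observation at max_cycle instead of being dropped.
    """
    encoded = []
    for ct in ct_values:
        if ct is None or ct > max_cycle:
            encoded.append((max_cycle, False))  # threshold not reached by max_cycle
        else:
            encoded.append((ct, True))  # threshold crossed at cycle ct
    return encoded


def logrank_test(group_a, group_b):
    """Two-group log-rank test on (cycle, observed) pairs.

    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    data = [(t, e, 0) for t, e in group_a] + [(t, e, 1) for t, e in group_b]
    event_cycles = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_cycles:
        # Samples still "at risk": threshold not yet crossed before cycle t.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        n = n_a + n_b
        # Samples crossing the threshold exactly at cycle t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d = sum(1 for ti, e, _ in data if ti == t and e)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical Ct readings; None marks an undetermined well.
control = encode_ct([28.1, 29.4, 30.2, 31.0, 33.7, None])
treated = encode_ct([33.5, 34.2, 35.1, 36.8, None, None])
chi2, p_value = logrank_test(control, treated)
print(f"log-rank chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

In this sketch, the undetermined wells stay in the risk set up to the maximum cycle, so they still contribute information, in contrast to CO, which drops them. They are also not counted as exact readings at that cycle, in contrast to MC.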