Dwivedi Alok K, Elhanafi Sherif E, Othman Mohamed O, Zuckerman Marc J
Division of Biostatistics & Epidemiology, Department of Molecular and Translational Medicine, Paul L. Foster School of Medicine, Texas Tech University Health Science Center, El Paso, Texas, United States.
Division of Gastroenterology, Department of Internal Medicine, Paul L. Foster School of Medicine, Texas Tech University Health Science Center, El Paso, Texas, United States.
PeerJ. 2025 May 26;13:e19504. doi: 10.7717/peerj.19504. eCollection 2025.
Colon cancer screening studies are needed for the early detection of colorectal polyps, which reduces the risk of colorectal cancer. However, data on colon polyps are typically analyzed in dichotomized form, and sometimes with standard count models, which can lead to inaccurate findings. Zero-inflated models are a more appropriate approach for evaluating colon polyps because they account for existing polyps that go undetected at screening colonoscopy.
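For reference, the textbook zero-inflated Poisson and hurdle Poisson distributions can be written as below. The notation is an assumption made here for illustration and is not taken from the paper: $\pi$ is the probability of an excess zero, $\pi_0$ the probability of a zero in the hurdle model, and $\lambda$ the Poisson mean.

```latex
% Zero-inflated Poisson (ZIP): zeros arise from two sources
\begin{align}
P(Y = 0) &= \pi + (1 - \pi)\, e^{-\lambda}, \\
P(Y = y) &= (1 - \pi)\, \frac{e^{-\lambda} \lambda^{y}}{y!}, \qquad y = 1, 2, \dots
\end{align}

% Hurdle Poisson (ZHP): all zeros come from one process, and
% positive counts follow a zero-truncated Poisson
\begin{align}
P(Y = 0) &= \pi_0, \\
P(Y = y) &= (1 - \pi_0)\, \frac{e^{-\lambda} \lambda^{y}}{y!\,\bigl(1 - e^{-\lambda}\bigr)}, \qquad y = 1, 2, \dots
\end{align}
```

In regression versions of these models, $\pi$ (or $\pi_0$) and $\lambda$ are typically linked to covariates through logit and log links, respectively.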
We demonstrated the application of zero-inflated and hurdle models, including the zero-inflated Poisson (ZIP), zero-inflated robust Poisson (ZIRP), zero-inflated negative binomial (ZINB), zero-inflated generalized Poisson (ZIGP), zero hurdle Poisson (ZHP), and zero hurdle negative binomial (ZHNB) models, and compared them with standard approaches, including logistic regression (LR), Poisson regression (PR), robust Poisson (RP), and negative binomial (NB) regression, for the evaluation of colorectal polyps using datasets from two randomized studies and one observational study. We also provided a step-by-step approach for selecting appropriate models for analyzing polyp data.
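One way to see why model choice matters is to fit a plain Poisson and a ZIP model to the same counts and compare them by AIC. The following is a minimal, covariate-free sketch using only the Python standard library. The polyp counts are hypothetical, the function names are invented for illustration, and the EM fitting routine is a generic textbook approach, not necessarily the procedure the authors used.

```python
import math

def poisson_logpmf(y, lam):
    """Log of the Poisson probability mass at y for mean lam."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)

def fit_poisson(counts):
    """Intercept-only Poisson MLE: the sample mean."""
    lam = sum(counts) / len(counts)
    loglik = sum(poisson_logpmf(y, lam) for y in counts)
    return lam, loglik

def zip_loglik(counts, pi, lam):
    """Log-likelihood of an intercept-only zero-inflated Poisson."""
    total = 0.0
    for y in counts:
        if y == 0:
            # A zero is either an excess zero or a Poisson zero.
            total += math.log(pi + (1 - pi) * math.exp(-lam))
        else:
            total += math.log(1 - pi) + poisson_logpmf(y, lam)
    return total

def fit_zip(counts, n_iter=500):
    """Fit intercept-only ZIP by EM; returns (pi, lam, loglik)."""
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    pi, lam = 0.5, max(total / n, 1e-8)
    for _ in range(n_iter):
        # E-step: probability that an observed zero is an excess zero.
        z = pi / (pi + (1 - pi) * math.exp(-lam))
        expected_excess = n_zero * z
        # M-step: update the mixing weight and the Poisson mean.
        pi = expected_excess / n
        lam = total / (n - expected_excess)
    return pi, lam, zip_loglik(counts, pi, lam)

def aic(loglik, n_params):
    """Akaike information criterion (lower is better)."""
    return 2 * n_params - 2 * loglik

# Hypothetical polyp counts per patient (NOT data from the paper).
counts = [0] * 60 + [1] * 10 + [2] * 12 + [3] * 9 + [4] * 5 + [5] * 3 + [6] * 1

pois_lam, pois_ll = fit_poisson(counts)
zip_pi, zip_lam, zip_ll = fit_zip(counts)
pois_aic = aic(pois_ll, 1)
zip_aic = aic(zip_ll, 2)

print(f"Poisson: lambda={pois_lam:.3f}, AIC={pois_aic:.1f}")
print(f"ZIP:     pi={zip_pi:.3f}, lambda={zip_lam:.3f}, AIC={zip_aic:.1f}")
```

With many more zeros than a Poisson with the same mean would predict, the ZIP fit should show a clearly lower AIC. A real analysis would add covariates (for example, the colonoscopy arm) and would also consider the negative binomial and hurdle variants listed above.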
All datasets contained a substantial proportion of zero polyp counts, and therefore zero-inflated or hurdle models performed better than single-distribution models. Using the ZIP analysis, we showed that cap-assisted colonoscopy yielded significantly more colon polyps than standard colonoscopy (risk ratio [RR] = 1.38; 95% confidence interval [CI] [1.05-1.81]). However, this finding was missed by standard analytic methods, including LR (odds ratio [OR] = 0.90; 95% CI [0.59-1.37]), PR (RR = 1.14; 95% CI [0.93-1.41]), and NB (RR = 1.16; 95% CI [0.89-1.51]). The standard approaches for analyzing polyp data, such as LR, PR, RP, or NB regression, produced potentially inaccurate findings compared with zero-inflated models in all example datasets. Furthermore, simulation studies confirmed the superiority of ZIRP over alternative models in a range of datasets differing from the case studies. ZIRP was found to be the optimal method for analyzing polyp data in randomized studies, while the ZINB/ZHNB model showed a better fit in an observational study.
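The contrast between the reported intervals can be made concrete by checking which of them exclude the null value of 1. The sketch below assumes each interval is a Wald interval that is symmetric on the log scale, which the abstract does not state. Under that assumption it recovers an approximate log-scale standard error, z statistic, and two-sided p-value from each rounded estimate. The helper names are hypothetical, and the derived values are approximations, not figures from the paper.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a 95% interval

def log_se_from_ci(lower, upper):
    """Approximate log-scale SE implied by a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)

def wald_summary(estimate, lower, upper):
    """Approximate z and two-sided p-value implied by a ratio and its CI."""
    se = log_se_from_ci(lower, upper)
    z = math.log(estimate) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return {
        "se": se,
        "z": z,
        "p": p,
        "excludes_null": lower > 1.0 or upper < 1.0,
    }

# Estimates and 95% CIs exactly as reported in the abstract.
reported = {
    "ZIP (RR)": (1.38, 1.05, 1.81),
    "LR (OR)": (0.90, 0.59, 1.37),
    "PR (RR)": (1.14, 0.93, 1.41),
    "NB (RR)": (1.16, 0.89, 1.51),
}

summaries = {name: wald_summary(*vals) for name, vals in reported.items()}

for name, s in summaries.items():
    flag = "excludes 1" if s["excludes_null"] else "includes 1"
    print(f"{name}: z={s['z']:.2f}, approx p={s['p']:.3f}, CI {flag}")
```

Only the ZIP interval excludes 1, which matches the abstract's statement that the standard methods missed the difference between cap-assisted and standard colonoscopy.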
We suggest that colonoscopy studies jointly use the polyp detection rate and polyp counts as quality measures. Based on theoretical, empirical, and simulation considerations, we encourage analysts to use zero-inflated models for evaluating colorectal polyps in colonoscopy screening studies to support proper clinical interpretation of data and accurate reporting of findings. A similar approach can also be used for analyzing other types of polyp counts in colonoscopy studies.