Karmakar Bikram, Zauber Ann G, Hahn Anne I, Lau Yan Kwan, Doubeni Chyke A, Joffe Marshall M
Department of Statistics, College of Liberal Arts and Sciences, University of Florida, Gainesville, FL, USA.
Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
Int J Epidemiol. 2024 Jun 12;53(4). doi: 10.1093/ije/dyae096.
Observational studies are frequently used to estimate the comparative effectiveness of different colorectal cancer (CRC) screening methods because of the practical limitations of, and time needed for, large clinical trials. However, time-varying confounders, e.g. polyp detection at the last screening, can bias statistical results. Recently, generalized methods, or G-methods, have been used to analyze observational studies of CRC screening, given their ability to account for such time-varying confounders. Discretization, the process of converting continuous functions into discrete counterparts, is required for G-methods when the treatment and outcomes are assessed on a continuous scale.
This paper evaluates the interplay between time-varying confounding and discretization, which can induce bias in assessing screening effectiveness. We investigate this bias in evaluating the effect of different CRC screening methods that differ from each other in typical screening frequency.
First, using theory, we establish the direction of the bias. Then, we use simulations of hypothetical settings to study the bias magnitude for varying levels of discretization, frequency of screening and length of the study period. We develop a method to assess possible bias due to coarsening in simulated situations.
The proposed method can inform future studies of screening effectiveness, especially for CRC, by guiding the choice of the interval length at which data are discretized, so as to minimize bias due to coarsening while balancing computational costs.
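As a rough illustration of the discretization step described above, the following minimal Python sketch coarsens continuous screening times into per-interval indicators at two interval lengths. The function name `discretize`, the time unit (years) and the example values are hypothetical and are not taken from the study; the sketch only shows how a coarser grid collapses several events into a single indicator and is not the authors' method.

```python
import math


def discretize(event_times, interval_length, study_length):
    """Map continuous event times to 0/1 indicators, one per interval.

    Hypothetical helper for illustration only: an interval is marked 1
    if at least one event falls inside it, so multiple events within
    the same interval become indistinguishable.
    """
    n_intervals = math.ceil(study_length / interval_length)
    indicators = [0] * n_intervals
    for t in event_times:
        if 0 <= t < study_length:
            indicators[int(t // interval_length)] = 1
    return indicators


# Hypothetical screening times (years since study entry) for one person.
screening_times = [0.4, 1.2, 1.9, 3.1]

fine = discretize(screening_times, interval_length=0.5, study_length=4.0)
coarse = discretize(screening_times, interval_length=2.0, study_length=4.0)

print(fine)    # eight half-year intervals
print(coarse)  # two 2-year intervals
```

With the coarse grid, the three screenings in the first two years are recorded as a single indicator, so their ordering relative to any time-varying confounder measured in the same interval is lost. A finer grid preserves that ordering at the cost of more intervals to model.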