Iwagami Masao, Ishimaru Miho, Takeuchi Yoshinori, Shinozaki Tomohiro
Department of Digital Health, Institute of Medicine, University of Tsukuba.
International Institute for Integrative Sleep Medicine (IIIS), University of Tsukuba.
J Epidemiol. 2025 Apr 5;35(4):161-169. doi: 10.2188/jea.JE20240245. Epub 2025 Feb 28.
In epidemiological or clinical studies with follow-up, the data tables generated and processed for statistical analysis are often in "wide format", with one row per individual. Depending on the situation and purpose of the study, however, they may need to be transformed into "long format", which allows multiple rows per individual. This tutorial clarifies the typical situations in which researchers are advised to split follow-up time to generate long-format data tables. In such applications, the major analytical aims are (i) estimating outcome incidence rates, or their ratios between two or more groups, within specific follow-up periods; (ii) examining the interaction between exposure status and follow-up time to assess the proportional hazards assumption in Cox models; (iii) handling time-varying exposures for descriptive or predictive purposes; (iv) estimating the causal effects of time-varying exposures while adjusting for time-varying confounders that may be affected by past exposures; and (v) comparing different time periods within the same individual in self-controlled case series analyses. This tutorial also discusses how to split follow-up time according to each purpose in practical settings, and provides example code in Stata, R, and SAS.
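The wide-to-long transformation described in the abstract can be illustrated with a minimal sketch. The tutorial itself provides code in Stata, R, and SAS; the Python below is not taken from it. The function name `split_follow_up`, the column names (`id`, `exposed`, `time`, `event`), the cut points, and the two example individuals are all hypothetical, and the sketch assumes each individual's follow-up starts at time 0 and has positive length.

```python
from typing import Dict, List


def split_follow_up(row: Dict, cut_points: List[float]) -> List[Dict]:
    """Split one wide-format row into long-format rows at the given cut points.

    `row` uses hypothetical keys: id, exposed, time (total follow-up), event (0/1).
    Assumes follow-up starts at 0 and row["time"] > 0.
    """
    # Interval boundaries: 0, every cut point strictly inside follow-up, end of follow-up.
    inner = [c for c in sorted(cut_points) if 0 < c < row["time"]]
    bounds = [0.0] + inner + [row["time"]]

    pieces = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        pieces.append({
            "id": row["id"],
            "exposed": row["exposed"],
            "start": start,
            "stop": stop,
            # The event is attributed only to the interval in which follow-up ends.
            "event": row["event"] if stop == row["time"] else 0,
        })
    return pieces


# Hypothetical wide-format data: one row per individual.
wide = [
    {"id": 1, "exposed": 1, "time": 2.5, "event": 1},
    {"id": 2, "exposed": 0, "time": 0.8, "event": 0},
]

# Long-format data: one row per individual per follow-up period (cut at 1 and 2 years).
long_rows = [piece for row in wide for piece in split_follow_up(row, [1.0, 2.0])]

for piece in long_rows:
    print(piece)
```

With the data in this shape, person-time (`stop - start`) and events can be summed within each period, which is the starting point for aim (i); the later aims additionally require period-specific covariate values on each row.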