Zhang Yujia, Crawford Sara, Boulet Sheree L, Monsour Michael, Cohen Bruce, McKane Patricia, Freeman Karen
Division of Reproductive Health, Centers for Disease Control and Prevention, Atlanta, GA.
Massachusetts Department of Public Health, Boston.
J Mod Appl Stat Methods. 2017;16(1):744-752. doi: 10.22237/jmasm/1493599140.
Temporal changes in methods for collecting longitudinal data can generate inconsistent distributions of the affected variables, but the effects on parameter estimates have not been well described. We examined differences in Apgar scores of infants born in 2000-2006 to women with ovulatory dysfunction (risk group) or tubal obstruction (reference group) who underwent assisted reproductive technology (ART), using Florida, Massachusetts, and Michigan birth certificate data linked to the Centers for Disease Control and Prevention's National ART Surveillance System database. In Florida, a 2004 change in birth certificate format produced inconsistent information on induction of labor, a control variable. To control for the bias that this inconsistent distribution might introduce, we analyzed multiply imputed data. We used the Cox-Iannacchione weighted sequential hot deck method to multiply impute labor induction values in Florida data collected before the change, as well as missing values in Florida data collected after the change and in all Massachusetts and Michigan data. The adjusted odds ratios for low Apgar score were 1.94 (95% confidence interval [CI] 1.32-2.85) using imputed induction of labor and 1.83 (95% CI 1.20-2.80) using non-imputed induction of labor. Compared with the estimate from multiple imputation, the estimate obtained using non-imputed induction of labor was biased towards the null with inflated standard errors, but the magnitude of the differences was small.
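The comparison of the two estimates can be checked roughly from the reported intervals alone. The sketch below back-calculates the standard error of the log odds ratio from each 95% CI. It assumes the intervals are Wald intervals, symmetric on the log scale with z = 1.96, which the abstract does not state. The function name `log_or_se` is illustrative and is not taken from the paper.

```python
import math

def log_or_se(lower, upper, z=1.96):
    """Approximate SE of log(OR) from a CI, assuming a Wald interval
    that is symmetric on the log scale (an assumption, not stated in the source)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Odds ratios and 95% CIs as reported in the abstract
or_imputed, ci_imputed = 1.94, (1.32, 2.85)
or_non_imputed, ci_non_imputed = 1.83, (1.20, 2.80)

se_imputed = log_or_se(*ci_imputed)
se_non_imputed = log_or_se(*ci_non_imputed)

print(f"log(OR), imputed:     {math.log(or_imputed):.3f}  SE = {se_imputed:.3f}")
print(f"log(OR), non-imputed: {math.log(or_non_imputed):.3f}  SE = {se_non_imputed:.3f}")
```

Under that assumption the standard errors come out at roughly 0.196 (imputed) and 0.216 (non-imputed). This agrees with the stated pattern: the non-imputed estimate is slightly closer to the null and slightly less precise.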