Department of Pediatrics, Division of Neonatology, Stanford University School of Medicine, Stanford, California.
Department of Obstetrics, Gynecology and Reproductive Sciences, University of California, San Francisco, California.
Birth Defects Res. 2019 Mar 1;111(4):212-221. doi: 10.1002/bdr2.1441. Epub 2018 Dec 26.
To generate new leads about risk factors for gastroschisis, a birth defect whose prevalence has been increasing over time, we applied an untargeted, data-mining statistical approach.
Using data exclusively from the California Center of the National Birth Defects Prevention Study, we compared 286 cases of gastroschisis and 1,263 non-malformed, live-born controls. All infants had delivery dates between October 1997 and December 2011 and were stratified by maternal age at birth (<20 and ≥ 20 years). Cases and controls were compared by maternal responses to 183 questions (219 variables) using random forest, a data mining procedure. Variables deemed important by random forest were included in logistic regression models to estimate odds ratios and 95% confidence intervals.
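As a rough illustration of the final estimation step described above, the sketch below computes an unadjusted odds ratio and a 95% Wald confidence interval from a 2x2 exposure table. For a single binary exposure, this matches what a logistic regression with that one predictor would give. The function name `odds_ratio_wald_ci` and all counts are hypothetical, chosen only to show the arithmetic. They are not taken from the study, and the random forest screening step and any covariate adjustment are not reproduced here.

```python
import math


def odds_ratio_wald_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    The interval is a Wald interval computed on the log-odds scale;
    z=1.96 gives an approximate 95% confidence interval.
    All four cell counts must be positive.
    """
    cells = (exposed_cases, unexposed_cases,
             exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all cell counts must be positive")

    # Odds of exposure among cases divided by odds of exposure among controls.
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)

    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / n for n in cells))

    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not study data).
or_, lo, hi = odds_ratio_wald_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

An adjusted estimate, as reported in the results, would instead come from a multivariable logistic regression fitted to the individual-level data.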
Among women younger than 20, the variables deemed important showed higher odds of gastroschisis for higher consumption of chocolate, low intake of iron, acetaminophen use, and urinary tract infections early in pregnancy. After adjustment, the higher odds remained for low iron intake and a urinary tract infection in the first month of pregnancy. Among women aged 20 or older, the variables deemed important showed higher odds for US-born women of Hispanic ethnicity and for parental substance abuse. Lower odds were observed for obese women, women who ate any cereal the month before pregnancy, and those with higher parity.
We did not discover many previously unreported associations, despite our novel approach to generate new hypotheses. However, our results do add evidence to some previously proposed risk factors.