Duell Eric J, Bracci Paige M, Moore Jason H, Burk Robert D, Kelsey Karl T, Holly Elizabeth A
International Agency for Research on Cancer, 150 Cours Albert Thomas, 69008 Lyon, France.
Cancer Epidemiol Biomarkers Prev. 2008 Jun;17(6):1470-9. doi: 10.1158/1055-9965.EPI-07-2797.
Data mining and data reduction methods to detect interactions in epidemiologic data are being developed and tested. In these analyses, multifactor dimensionality reduction, the focused interaction testing framework, and traditional logistic regression models were used to identify potential interactions among up to three factors. These techniques were applied to a population-based case-control study of pancreatic cancer from the San Francisco Bay Area (308 cases, 964 controls). The analyses included 26 polymorphisms in 20 genes from 7 biochemical pathways, along with tobacco smoking. Combinations of genetic markers and cigarette smoking were identified as potential risk factors for pancreatic cancer, including genes in base excision repair (OGG1), nucleotide excision repair (XPD, XPA, XPC), and double-strand break repair (XRCC3). XPD.751, XPD.312, and cigarette smoking were the best single-factor predictors of pancreatic cancer risk, whereas XRCC3.241 × smoking and OGG1.326 × XPC.PAT were the best two-factor predictors. There was some evidence for a three-factor combination of OGG1.326 × XPD.751 × smoking, but the covariate-adjusted relative-risk estimates lacked precision. Multifactor dimensionality reduction and the focused interaction testing framework showed little concordance, whereas logistic regression allowed for covariate adjustment and model confirmation. Our data suggest that multiple common alleles from DNA repair pathways, in combination with cigarette smoking, may increase the risk for pancreatic cancer, and that multiple approaches to data screening and analysis are necessary to identify potentially new risk factor combinations.
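As an illustration of the data-reduction idea behind multifactor dimensionality reduction, the following minimal sketch pools each multifactor cell into a high-risk or low-risk group by comparing its case:control ratio with the overall ratio, then scores every factor combination by balanced accuracy. It is not the software or pipeline used in the study: the factor names (`snp_a`, `snp_b`, `smoking`), the toy data, and all function names are hypothetical, and the cross-validation and permutation testing that published implementations typically add are omitted.

```python
from collections import Counter
from itertools import combinations, product


def mdr_high_risk_cells(rows, labels, factors):
    """Return the multifactor cells whose case:control ratio exceeds
    the overall case:control ratio (the 'high-risk' cells)."""
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    cases, controls = Counter(), Counter()
    for row, is_case in zip(rows, labels):
        cell = tuple(row[f] for f in factors)
        (cases if is_case else controls)[cell] += 1
    # Cross-multiplying avoids division by zero for cells with no controls.
    return {
        cell
        for cell in set(cases) | set(controls)
        if cases[cell] * n_controls > controls[cell] * n_cases
    }


def balanced_accuracy(rows, labels, factors, high_risk):
    """Mean of sensitivity and specificity when 'high-risk' predicts case.
    Assumes the data contain at least one case and one control."""
    tp = fn = tn = fp = 0
    for row, is_case in zip(rows, labels):
        predicted_case = tuple(row[f] for f in factors) in high_risk
        if is_case:
            tp += predicted_case
            fn += not predicted_case
        else:
            fp += predicted_case
            tn += not predicted_case
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def best_model(rows, labels, names, order):
    """Exhaustively score every combination of `order` factors."""
    scored = []
    for factors in combinations(names, order):
        high_risk = mdr_high_risk_cells(rows, labels, factors)
        scored.append((balanced_accuracy(rows, labels, factors, high_risk), factors))
    return max(scored)


# Hypothetical toy data: case status depends only on the joint pattern of
# snp_a and smoking (an XOR), so no single factor is informative on its own.
names = ["snp_a", "snp_b", "smoking"]
rows = [dict(zip(names, combo)) for combo in product((0, 1), repeat=3)]
labels = [row["snp_a"] ^ row["smoking"] for row in rows]

print(best_model(rows, labels, names, order=1))
print(best_model(rows, labels, names, order=2))
```

In this toy example every single-factor model scores at chance, while the two-factor search recovers the interacting pair. That is the kind of pattern a screening method of this type is meant to surface before confirmation with covariate-adjusted logistic regression.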