Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden.
PLoS One. 2013 Nov 19;8(11):e80080. doi: 10.1371/journal.pone.0080080. eCollection 2013.
Both genetic and environmental factors are important for the development of allergic diseases. However, a detailed understanding of how such factors act together is lacking. To elucidate the interplay between genetic and environmental factors in allergic diseases, we used a novel bioinformatics approach that combines feature selection and machine learning. In two study populations, PARSIFAL (a European cross-sectional study of 3113 children) and BAMSE (a Swedish birth cohort including 2033 children), genetic variants as well as environmental and lifestyle factors were evaluated for their contribution to allergic phenotypes. Monte Carlo feature selection and rule-based models were used to identify and rank rules describing how combinations of genetic and environmental factors affect the risk of allergic diseases. Novel interactions between genes were suggested and replicated, such as between ORMDL3 and RORA, where certain genotype combinations gave odds ratios for current asthma of 2.1 (95% CI 1.2-3.6) and 3.2 (95% CI 2.0-5.0) in the BAMSE and PARSIFAL children, respectively. Several combinations of environmental factors appeared to be important for the development of allergic disease in children. For example, use of baby formula and antibiotics early in life was associated with an odds ratio of 7.4 (95% CI 4.5-12.0) for developing asthma. Furthermore, genetic variants together with environmental factors seemed to play a role in allergic diseases, such as the use of antibiotics early in life and COL29A1 variants for asthma, and farm living and NPSR1 variants for allergic eczema. Overall, combinations of environmental and lifestyle factors appeared more frequently in the models than combinations solely involving genes. In conclusion, a new bioinformatics approach is described for analyzing complex data, including extensive genetic and environmental information. Interactions identified with this approach could provide useful hints for further in-depth studies of etiological mechanisms and may also strengthen the basis for risk assessment and prevention.
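The odds ratios and 95% confidence intervals reported above summarize how strongly a factor combination is associated with an outcome. As a minimal sketch, assuming a simple 2x2 table and the standard Woolf (log-based) interval, such an estimate could be computed as follows. The abstract does not state which interval method the authors used, and the function name `odds_ratio_ci` and the counts below are hypothetical, chosen only for illustration; they are not study data.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    a: exposed cases        b: exposed non-cases
    c: unexposed cases      d: unexposed non-cases
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT taken from PARSIFAL or BAMSE
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An interval whose lower bound lies above 1, as in the results quoted in the abstract, indicates an association that is unlikely to be explained by sampling variation alone under this approximation.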