Research and Development Unit, Parc Sanitari Sant Joan de Déu, Fundació Sant Joan de Déu, CIBERSAM, Dr. Antoni Pujadas, 42, Sant Boi de Llobregat, 08830 Barcelona, Spain.
School of Mathematics and Statistics, Victoria University of Wellington, Wellington 6140, New Zealand.
Int J Environ Res Public Health. 2018 Jan 10;15(1):106. doi: 10.3390/ijerph15010106.
Current studies of gene × air pollution interaction typically seek to identify unknown heritability of common complex illnesses arising from variability in the host's susceptibility to environmental pollutants of interest. Accordingly, single-component generalized linear models are often used to model the risk posed by an environmental exposure variable of interest in relation to DNA variants determined a priori. However, such an approach, which relies primarily on the modeled DNA variants, may be further optimized by reducing phenotypic heterogeneity. Here, we reduce the phenotypic heterogeneity of asthma severity and identify single nucleotide polymorphisms (SNPs) associated with phenotype subgroups. Specifically, we first apply an unsupervised learning algorithm and a non-parametric regression to find a biclustering structure of children according to their allergy and asthma severity. We then identify the set of SNPs most closely correlated with each subgroup. We subsequently fit a logistic regression model for each subgroup against the healthy controls, using benzo[a]pyrene (B[a]P) as a representative airborne carcinogen. Applying this approach to a case-control data set shows that SNP clustering may help to partly explain, with greater efficiency, the heterogeneity in children's asthma susceptibility in relation to ambient B[a]P concentration.
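The final step described above, a separate logistic regression for each phenotype subgroup against the healthy controls, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the subgroup labels have already been produced by the clustering step (label 0 is taken to mean healthy control), uses purely synthetic data, and includes an exposure × SNP interaction term as an illustrative assumption; the names `fit_logistic` and `fit_per_subgroup` are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Fit a logistic regression by Newton-Raphson; returns [intercept, coefficients...]."""
    Xd = np.column_stack([np.ones(len(X)), X])      # prepend an intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))        # predicted case probabilities
        grad = Xd.T @ (y - p)                       # gradient of the log-likelihood
        w = p * (1.0 - p)                           # per-observation weights
        # Negative Hessian, with a tiny ridge added only for numerical stability
        hess = (Xd * w[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)   # Newton step
    return beta

def fit_per_subgroup(exposure, snp, subgroup, control_label=0):
    """Fit one case-vs-control model per case subgroup (labels assumed given)."""
    results = {}
    controls = subgroup == control_label
    for g in np.unique(subgroup):
        if g == control_label:
            continue
        mask = controls | (subgroup == g)           # this subgroup plus all controls
        X = np.column_stack([
            exposure[mask],                         # ambient exposure (e.g. B[a]P)
            snp[mask],                              # minor-allele count (0, 1, 2)
            exposure[mask] * snp[mask],             # interaction term (an assumption here)
        ])
        y = (subgroup[mask] == g).astype(float)     # 1 = case in subgroup g, 0 = control
        results[int(g)] = fit_logistic(X, y)
    return results

# Synthetic data for illustration only; no real measurements are used.
rng = np.random.default_rng(0)
n = 600
subgroup = rng.integers(0, 3, size=n)               # 0 = control, 1 and 2 = case subgroups
exposure = rng.normal(size=n)                       # standardized exposure values
snp = rng.integers(0, 3, size=n).astype(float)      # hypothetical SNP genotype

results = fit_per_subgroup(exposure, snp, subgroup)
for g, beta in results.items():
    print(f"subgroup {g}: intercept, exposure, SNP, interaction =", np.round(beta, 3))
```

Fitting each subgroup separately lets the exposure and interaction coefficients differ between subgroups, which is the kind of heterogeneity a single pooled case-control model would average over.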