Associations between persistent organic pollutants and endometriosis: A multipollutant assessment using machine learning algorithms.

Author affiliations

LABERCA, Oniris, INRAE, 44307, Nantes, France.

StatSC, ONIRIS, INRAE, Nantes, France.

Publication information

Environ Pollut. 2020 May;260:114066. doi: 10.1016/j.envpol.2020.114066. Epub 2020 Jan 28.

Abstract

Endometriosis is a gynaecological disease characterised by the presence of endometriotic tissue outside the uterus, and it affects a significant fraction of women of childbearing age. Evidence from epidemiological studies suggests a relationship between the risk of endometriosis and exposure to some organochlorine persistent organic pollutants (POPs). However, these chemicals are numerous and occur in complex, highly correlated mixtures, and to date most studies have not accounted for this simultaneous exposure. Linear and logistic regression models are limited in their ability to adjust for multiple exposures when variables are highly intercorrelated, which results in unstable coefficients and arbitrary findings. Advanced machine learning models, of emerging use in epidemiology, appear to be a promising option to address these limitations. In this study, different machine learning techniques were compared on a dataset from a case-control study conducted in France to explore associations between mixtures of POPs and deep endometriosis. The battery of models encompassed regularised logistic regression, artificial neural network, support vector machine, adaptive boosting, and partial least-squares discriminant analysis with some additional sparsity constraints. These techniques were applied to identify the biomarkers of internal exposure in adipose tissue most associated with endometriosis and to compare model classification performance. The five tested models consistently selected the POPs most associated with deep endometriosis, including octachlorodibenzofuran, cis-heptachlor epoxide, polychlorinated biphenyl 77 and trans-nonachlor, among others. The high classification performance of all five models confirmed that machine learning may be a promising complementary approach in modelling highly correlated exposure biomarkers and their associations with health outcomes. Regularised logistic regression provided a good compromise between the interpretability of traditional statistical approaches and the classification capacity of machine learning approaches. Applying a battery of complementary algorithms may be a strategic approach to decipher complex exposome-health associations when the underlying structure is unknown.
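
To illustrate why a penalty helps when exposures are highly correlated, the following is a minimal sketch of regularised logistic regression, not the authors' pipeline. It assumes an L1 (lasso-type) penalty fitted by proximal gradient descent; the abstract does not state which penalty or solver was used. The data are synthetic, and all names (`simulate`, `fit_l1_logistic`, `soft_threshold`) are hypothetical.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam, step=0.1, n_iter=500):
    """Proximal gradient descent for L1-penalised logistic regression.

    Returns (intercept, weights). The intercept is not penalised.
    The L1 penalty is an assumption made for this illustration.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed label.
        resid = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
                 for row, yi in zip(X, y)]
        grad_b = sum(resid) / n
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n
                  for j in range(p)]
        b -= step * grad_b
        # Gradient step followed by soft-thresholding (the L1 proximal operator).
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return b, w


def simulate(n=200, seed=0):
    """Synthetic data: x0 and x1 are highly correlated, x2 is independent.

    Only x0 drives the outcome. All values are made up for illustration.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x0 = rng.gauss(0, 1)
        x1 = x0 + rng.gauss(0, 0.1)  # near-duplicate of x0
        x2 = rng.gauss(0, 1)
        X.append([x0, x1, x2])
        y.append(1 if rng.random() < sigmoid(1.5 * x0) else 0)
    return X, y


if __name__ == "__main__":
    X, y = simulate()
    # Compare an unpenalised fit with two penalty strengths.
    for lam in (0.0, 0.05, 0.2):
        b, w = fit_l1_logistic(X, y, lam)
        print(lam, round(b, 3), [round(v, 3) for v in w])
```

Running the script prints the intercept and coefficients for each penalty strength, so the effect of the penalty on the two near-duplicate exposures can be compared with the unpenalised fit. In practice a penalty strength would be chosen by cross-validation rather than fixed by hand.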
