Frenoy Pauline, Ahmed Ismaïl, Marques Chloé, Ren Xuan, Severi Gianluca, Perduca Vittorio, Mancini Francesca Romana
Inserm, Gustave Roussy, Centre for Research in Epidemiology and Population Health (CESP), "Exposome, Heredity, Cancer, and Health" Team, Université Paris-Saclay, UVSQ, 12 Avenue Paul Vaillant Couturier, 94805, Villejuif, France.
Inserm, CESP, Université Paris-Saclay, UVSQ, Villejuif, France.
Sci Rep. 2025 Jan 15;15(1):2058. doi: 10.1038/s41598-025-85438-9.
Persistent organic pollutants (POPs) are a group of organic chemical compounds. Epidemiological studies attempting to elucidate their relationship with breast cancer risk have produced contradictory results. This study explored the relationship between dietary exposure to multiple POPs and ER-positive breast cancer risk in the French E3N cohort study, using three different approaches to handle multicollinearity among exposures. Intakes of 81 POPs were estimated using food consumption data from a validated semi-quantitative food frequency questionnaire and food contamination data. In the first approach, hierarchical clustering was performed to identify clusters of correlated POPs. For each cluster, the levels of the POPs belonging to it were averaged. These average levels were then included in a Cox model to estimate their associations with ER-positive breast cancer occurrence. The second and third approaches were principal component Cox regression (PCR-Cox) and partial least squares Cox regression (PLS-Cox), respectively. Both are dimension-reduction methods (unsupervised and supervised, respectively) coupled with a Cox model; they were used to identify principal components of POPs and to estimate their associations with ER-positive breast cancer occurrence. All models were adjusted for potential confounders previously identified using a directed acyclic graph. The study included 66,722 women with a median follow-up of 20.3 years, during which 3,739 developed incident ER-positive breast cancer. The variable clustering method did not identify any association between the averaged variables and ER-positive breast cancer risk. Five components were retained with both the PCR-Cox and PLS-Cox methods, explaining 82% and 77% of the variance in the initial exposure matrix, respectively. Among these components, none was significantly associated with the occurrence of ER-positive breast cancer.
This study provides an illustrative example of the application of three distinct statistical methods in the context of highly correlated environmental exposures, discussing their potential relevance and limitations within this specific framework.
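The first approach (cluster correlated exposures, then average within each cluster) can be illustrated with a minimal sketch on synthetic data. Everything specific here is an assumption made for illustration and is not taken from the study: the distance 1 - |Pearson r|, average linkage, the 0.5 cut-off, standardizing each variable before averaging, and the names `pop_1` to `pop_5`. The Cox model step is not shown; the averaged variables would be passed to a survival-analysis library together with the confounders.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardize(x):
    """Centre to mean 0 and scale to unit (population) standard deviation."""
    n = len(x)
    m = sum(x) / n
    sd = math.sqrt(sum((a - m) ** 2 for a in x) / n)
    return [(a - m) / sd for a in x]


def cluster_exposures(columns, max_distance):
    """Agglomerative clustering of variables with average linkage.

    The distance between two variables is 1 - |Pearson r| (an assumption).
    Merging stops once the closest pair of clusters is farther apart
    than max_distance.
    """
    names = list(columns)
    dist = {
        (a, b): 1.0 - abs(pearson(columns[a], columns[b]))
        for a in names for b in names if a != b
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > max_distance:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def cluster_averages(columns, clusters):
    """Average the standardized levels of the variables in each cluster."""
    z = {name: standardize(values) for name, values in columns.items()}
    n = len(next(iter(z.values())))
    return [
        [sum(z[name][k] for name in cluster) / len(cluster) for k in range(n)]
        for cluster in clusters
    ]


# Synthetic intakes: two independent latent sources, each driving a group
# of strongly correlated hypothetical pollutants.
random.seed(0)
n = 200
latent_a = [random.gauss(0, 1) for _ in range(n)]
latent_b = [random.gauss(0, 1) for _ in range(n)]
intakes = {
    "pop_1": [a + random.gauss(0, 0.2) for a in latent_a],
    "pop_2": [a + random.gauss(0, 0.2) for a in latent_a],
    "pop_3": [a + random.gauss(0, 0.2) for a in latent_a],
    "pop_4": [b + random.gauss(0, 0.2) for b in latent_b],
    "pop_5": [b + random.gauss(0, 0.2) for b in latent_b],
}

clusters = cluster_exposures(intakes, max_distance=0.5)
averaged = cluster_averages(intakes, clusters)
print(clusters)
```

Averaging within clusters replaces many collinear exposures with a few weakly correlated summary variables, which stabilizes the regression coefficients at the cost of no longer attributing an effect to any single compound.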