Department of Epidemiology and Health Statistics, School of Public Health, Fujian Medical University, Fuzhou, China.
Department of Breast Surgery, Fujian Medical University Union Hospital, Fuzhou, China.
Cancer Med. 2023 Jul;12(14):15504-15514. doi: 10.1002/cam4.6198. Epub 2023 Jun 2.
Despite the rising incidence and mortality of breast cancer among women in China, few predictive models for breast cancer exist for the Chinese population, and those that do have low accuracy. This study aimed to identify major genetic and lifestyle risk factors in a Chinese population for potential application in risk assessment models.
A case-control study including 1321 breast cancer patients and 2045 controls was conducted in southeast China during 2013-2016; the data were randomly divided into a training set and a test set in a 7:3 ratio. The association of genetic and lifestyle factors with breast cancer was examined using logistic regression models. Using the area under the receiver operating characteristic curve (AUC), we also compared the performance of the logistic model with that of machine learning models, namely a LASSO regression model and a support vector machine (SVM), and with the scores calculated from the CKB, Gail, and Tyrer-Cuzick models in the test set.
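The two evaluation steps described above, a random 7:3 split and an AUC comparison on the held-out set, can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function names `split_7_3` and `auc` are hypothetical, the risk scores are synthetic, and nothing here reproduces the study's data or its fitted models.

```python
import random


def split_7_3(records, seed=0):
    """Shuffle a copy of `records` and split it 70/30 (train, test)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the probability that a randomly chosen case (label 1)
    scores higher than a randomly chosen control (label 0); ties count 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Synthetic stand-in data: (label, score) pairs, NOT the study's data.
    rng = random.Random(42)
    data = [(1, rng.gauss(0.6, 1.0)) for _ in range(1321)]
    data += [(0, rng.gauss(0.0, 1.0)) for _ in range(2045)]

    train, test = split_7_3(data)
    test_labels = [y for y, _ in test]
    test_scores = [s for _, s in test]
    print(len(train), len(test), round(auc(test_labels, test_scores), 3))
```

Evaluating every candidate score (logistic, LASSO, SVM, or an external model such as Gail) on the same held-out 30% is what makes the AUC values directly comparable.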
Among all factors considered, the best model was obtained when the polygenic risk score, lifestyle factors, and reproductive factors were considered jointly in the logistic regression model (AUC = 0.73; 95% CI: 0.70-0.77). The models created in this study performed better than those using scores calculated from the CKB, Gail, and Tyrer-Cuzick models. However, the performance of the logistic model and the machine learning models did not differ significantly.
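An interval such as the one reported for the AUC can be obtained in several ways; the abstract does not state which method was used. One common option, shown here purely as an assumption, is a percentile bootstrap over the test set. The helper names `auc` and `bootstrap_auc_ci` are hypothetical and the data are synthetic.

```python
import random


def auc(labels, scores):
    """Probability that a random case outscores a random control (ties = 0.5)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def bootstrap_auc_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC.

    Resamples subjects with replacement; resamples that contain only
    one class are skipped because the AUC is undefined for them.
    """
    n = len(labels)
    if not 0 < sum(labels) < n:
        raise ValueError("need at least one case and one control")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[max(int((1 - alpha / 2) * n_boot) - 1, 0)]
    return lower, upper


if __name__ == "__main__":
    # Synthetic scores for 30 controls and 30 cases, NOT the study's data.
    rng = random.Random(1)
    labels = [0] * 30 + [1] * 30
    scores = [rng.gauss(0.0, 1.0) for _ in range(30)]
    scores += [rng.gauss(0.8, 1.0) for _ in range(30)]
    print(auc(labels, scores), bootstrap_auc_ci(labels, scores))
```

When two models are scored on the same test subjects, a paired procedure (for example, bootstrapping the difference in AUC) is generally more appropriate for judging whether they differ than checking whether their separate intervals overlap.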
In summary, we identified genetic and lifestyle risk predictors for breast cancer that yield moderate discrimination and might serve as a reference for breast cancer screening in southeast China. Further population-based studies are needed to validate the model for future application in personalized breast cancer screening programs.