Li James L, Zanti Maria, Williams Jacob, Jahagirdar Om, Jia Guochong, Turcan Alistair, Hu Qiang, Brandenburg Jean-Tristan, Yan Li, Ho Weang-Kee, Li Jingmei, Miranda José Patricio, Godbole Devika, Dias Julie-Alexia, Zhang Xiaomeng, Dorling Leila, Chen Wenlong Carl, Boddicker Nicholas, Wang Ying, Martin Alicia, Zhang Yan Dora, Dennis Joe, John Esther M, Torres-Mejia Gabriela, Kushi Larry, Weitzel Jeffrey, Neuhausen Susan L, Carvajal-Carmona Luis, Haiman Christopher, Ziv Elad, Fejerman Laura, Zheng Wei, Huo Dezheng, Easton Douglas, Chanock Stephen J, Chatterjee Nilanjan, Kraft Peter, Garcia-Closas Montserrat, Wong Wendy S W, Michailidou Kyriaki, Zhu Qianqian, Zhang Martin Jinye, Dutta Diptavo, Ahearn Thomas U, Zhang Haoyu
Department of Public Health Sciences, University of Chicago, Chicago, IL, USA.
Biostatistics Unit, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus.
medRxiv. 2025 Aug 25:2025.08.20.25334075. doi: 10.1101/2025.08.20.25334075.
Breast cancer genome-wide association studies (GWAS) have identified over 200 independent genome-wide significant susceptibility markers. However, most studies have focused on one or two ancestral groups. We examined breast cancer genetic architecture using GWAS summary statistics from African (AFR), East Asian (EAS), European (EUR) and Hispanic/Latina (H/L) samples, totaling 159,297 cases and 212,102 controls, comprising the largest multi-ancestry study of breast cancer to date. The logit-scale heritability of breast cancer ranged from h² = 0.47 (SE = 0.07) in EAS to h² = 0.61 (SE = 0.10) in AFR, with no significant differences across ancestries (p = 0.63). The estimated number of susceptibility markers in a sparse normal-mixture effects model also varied, from 4,446 (SE = 3,100) in EAS to 8,308 (SE = 2,751) in AFR, but differences were not significant across ancestries (p = 0.55). Cross-sample genetic correlations varied, with the strongest correlation between EUR and EAS (r_g = 0.79, SE = 0.08) and the weakest between AFR and H/L (r_g = 0.26, SE = 0.24). Common variants in regulatory elements were enriched for genetic association across samples. By integrating the GWAS summary statistics with the Tabula Sapiens scRNA-seq atlas, we identified ancestry-shared associations between breast cancer and specific cell types, including innate immune cells, secretory epithelial cells and stromal cells. Collectively, these results support a largely shared polygenic architecture of breast cancer across ancestries, with consistent enrichment of common regulatory variants and convergent cellular signatures identified through single-cell analyses.
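To illustrate why the reported standard errors make the ancestry differences non-significant, the following minimal sketch applies a two-sided Wald test to the EAS and AFR estimates quoted in the abstract. It is not the authors' method: the abstract's p-values come from a test across all four ancestries, whereas this compares only two groups, and it assumes the two estimates are independent and approximately normally distributed. The function name `wald_difference_test` is hypothetical.

```python
import math


def wald_difference_test(est_a, se_a, est_b, se_b):
    """Two-sided Wald test for the difference between two estimates.

    Assumes the estimates are independent and approximately normal
    (an illustrative assumption, not stated in the abstract).
    Returns the z statistic and the two-sided p-value.
    """
    diff = est_b - est_a
    # Standard error of a difference of independent estimates.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values quoted in the abstract: (EAS estimate, EAS SE, AFR estimate, AFR SE).
comparisons = {
    "logit-scale heritability": (0.47, 0.07, 0.61, 0.10),
    "number of susceptibility markers": (4446, 3100, 8308, 2751),
}

for label, (eas, se_eas, afr, se_afr) in comparisons.items():
    z, p = wald_difference_test(eas, se_eas, afr, se_afr)
    print(f"{label}: z = {z:.2f}, two-sided p = {p:.2f}")
```

Under these assumptions neither pairwise difference reaches significance at the 0.05 level, which is consistent in direction with the across-ancestry results reported in the abstract, although the p-values are not expected to match.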