Long Erping, Yin Jinhu, Shin Ju Hye, Li Yuyan, Kane Alexander, Patel Harsh, Luong Thong, Xia Jun, Han Younghun, Byun Jinyoung, Zhang Tongwu, Zhao Wei, Landi Maria Teresa, Rothman Nathaniel, Lan Qing, Chang Yoon Soo, Yu Fulong, Amos Christopher, Shi Jianxin, Lee Jin Gu, Kim Eun Young, Choi Jiyeon
Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
Current affiliation: Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
bioRxiv. 2023 Sep 26:2023.09.25.559336. doi: 10.1101/2023.09.25.559336.
Genome-wide association studies (GWAS) have identified over fifty loci associated with lung cancer risk. However, the genetic mechanisms and target genes underlying these loci are largely unknown, as most risk-associated variants might regulate gene expression in a context-specific manner. Here, we generated a barcode-shared transcriptome and chromatin accessibility map of 117,911 human lung cells from age/sex-matched ever- and never-smokers to profile context-specific gene regulation. Accessible chromatin peak detection identified cell-type-specific candidate cis-regulatory elements (cCREs) from each lung cell type. Colocalization of lung cancer candidate causal variants (CCVs) with these cCREs prioritized the variants for 68% of the GWAS loci, a subset of which was also supported by transcription factor abundance and footprinting. cCRE colocalization and single-cell-based trait relevance score nominated epithelial and immune cells as the main cell groups contributing to lung cancer susceptibility. Notably, cCREs of rare proliferating epithelial cell types, such as AT2-proliferating (0.13%) and basal cells (1.8%), overlapped with CCVs, including those in . A multi-level cCRE-gene linking system identified candidate susceptibility genes from 57% of lung cancer loci, including those not detected in tissue- or cell-line-based approaches. cCRE-gene linkage uncovered that adjacent genes expressed in different cell types are correlated with distinct subsets of coinherited CCVs, including and at the 11q23.3 locus. Our data revealed the cell types and contexts where the lung cancer susceptibility genes are functional.