Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland.
Department of Medicine, University of Maryland School of Medicine, Baltimore, Maryland.
Cancer. 2019 Jun 15;125(12):2076-2088. doi: 10.1002/cncr.32020. Epub 2019 Mar 13.
Although cell lines are an essential resource for studying cancer biology, many are of unknown ancestral origin, and their use may not be optimal for evaluating the biology of all patient populations.
An admixture analysis was performed using genome-wide chip data from the Catalogue of Somatic Mutations in Cancer (COSMIC) Cell Lines Project to calculate genetic ancestry estimates for 1018 cancer cell lines. After stratifying the analyses by tissue and histology types, linear models were used to evaluate the influence of ancestry on gene expression and somatic mutation frequency.
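As a rough illustration of the stratified linear-model step described above, the following minimal sketch regresses a gene's expression on an ancestry proportion within each tissue type, using ordinary least squares. It is not the authors' pipeline: the function names (`fit_ancestry_model`, `fit_by_stratum`), the single-predictor model, and the minimum stratum size of 3 are assumptions made for illustration only, and the abstract does not state which covariates or software were used.

```python
import numpy as np


def fit_ancestry_model(expression, ancestry):
    """Ordinary least squares fit of: expression = intercept + slope * ancestry.

    `ancestry` is an estimated ancestry proportion in [0, 1] per cell line
    (hypothetical input; the real study derived these from admixture analysis).
    Returns (intercept, slope).
    """
    expression = np.asarray(expression, dtype=float)
    ancestry = np.asarray(ancestry, dtype=float)
    # Design matrix with an intercept column and the ancestry proportion.
    design = np.column_stack([np.ones_like(ancestry), ancestry])
    coef, *_ = np.linalg.lstsq(design, expression, rcond=None)
    return float(coef[0]), float(coef[1])


def fit_by_stratum(expression, ancestry, tissue, min_size=3):
    """Fit the model separately within each tissue type (the stratification step).

    Strata with fewer than `min_size` cell lines are skipped; this threshold
    is an illustrative assumption, not a value taken from the study.
    """
    expression = np.asarray(expression, dtype=float)
    ancestry = np.asarray(ancestry, dtype=float)
    tissue = np.asarray(tissue)
    results = {}
    for label in np.unique(tissue):
        mask = tissue == label
        if mask.sum() < min_size:
            continue
        results[str(label)] = fit_ancestry_model(expression[mask], ancestry[mask])
    return results
```

A slope that differs reliably from zero within a stratum would suggest an association between ancestry and expression for that gene in that tissue; a real analysis would also need significance testing and multiple-testing correction across genes.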
For the 701 cell lines with unreported ancestry, 215 were of East Asian origin, 30 were of African or African American origin, and 453 were of European origin. Notable imbalances were observed in ancestral representation across tissue types, with the majority of analyzed tissue types having few cell lines of African American ancestral origin, and with Hispanic and South Asian ancestry being almost entirely absent across all cell lines. In evaluating gene expression across these cell lines, expression levels of the genes neurobeachin like 1 (NBEAL1), solute carrier family 6 member 19 (SLC6A19), HEAT repeat containing 6 (HEATR6), and epithelial cell transforming 2 like (ECT2L) were associated with ancestry. Significant differences were also observed in the proportions of somatic mutation types across cell lines with varying ancestral proportions.
By estimating genetic ancestry for 1018 cancer cell lines, the authors have produced a resource that cancer researchers can use to ensure that their cell lines are ancestrally representative of the populations they intend to affect. Furthermore, the novel ancestry-specific signal identified underscores the importance of ancestral awareness when studying cancer.