Carrio-Cordo Paula, Acheson Elise, Huang Qingyao, Baudis Michael
Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland.
Swiss Institute of Bioinformatics, Zurich, Switzerland.
Database (Oxford). 2020 Jan 1;2020. doi: 10.1093/database/baaa009.
Cancers arise from the accumulation of somatic genome mutations, which can be influenced by inherited genomic variants and by external factors such as environmental or lifestyle-related exposures. Due to the heterogeneity of cancers, precise information about the genomic composition of germline and malignant tissues has to be correlated with morphological, clinical and extrinsic features to advance medical knowledge and treatment options. With global differences in cancer frequencies and disease types, geographic data are important for understanding the interplay between genetic ancestry and environmental influence in cancer incidence, progression and treatment outcome. In this study, we analyzed the current landscape of oncogenomic screening publications for geographic information content and quality, to identify underrepresented study populations and thereby help fill prominent gaps in our understanding of interactions between somatic variations, population genetics and environmental factors in oncogenesis. We conclude that while proxy-derived geographic annotations can be useful for coarse-grained associations, the study of geo-correlated factors in cancer causation and progression will benefit from standardized geographic provenance annotations. Additionally, publication-derived geographic provenance data allowed us to highlight stark inequality in the geographies of cancer genome profiling, with a near absence of sizable studies from Africa and other large regions.