Medical Oncology, Labcorp Oncology, 6 Moore Dr., Durham, NC 27560, United States.
Department of Pathology, Duke University Medical Center, Duke Cancer Institute, 40 Duke Medicine Cir, Durham, NC 27710, United States.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae557.
Disparities in cancer diagnosis, treatment, and outcomes based on self-identified race and ethnicity (SIRE) are well documented, yet these variables have historically been excluded from clinical research. When SIRE is unavailable, genetic ancestry can be inferred from single-nucleotide polymorphisms (SNPs) detected in tumor DNA by comprehensive genomic profiling (CGP). However, factors inherent to CGP of tumor DNA make ancestry-informative SNPs harder to identify, and current workflows for inferring genetic ancestry from CGP need improvement in key areas of the inference process. This study used genomic data from 4274 diverse reference subjects and CGP data from 491 patients with solid tumors and SIRE to develop and validate a workflow that obtains accurate genetically inferred ancestry (GIA) from CGP sequencing results. We used consensus-based classification to derive confident ancestral inferences from an expanded reference dataset covering eight world populations (African, Admixed American, Central Asian/Siberian, European, East Asian, Middle Eastern, Oceanian, South Asian). Our GIA calls were highly concordant with SIRE (95%) and aligned well with the reference populations of the inferred ancestries. Further, our workflow could expand on SIRE by (i) detecting the ancestry of patients who usually lack appropriate racial categories, (ii) determining which patients have mixed ancestry, and (iii) resolving the ancestries of patients in heterogeneous racial categories and of patients with missing SIRE. Accurate GIA provides needed information to enable ancestry-aware biomarker research, ensure the inclusion of underrepresented groups in clinical research, and increase the diverse representation of patient populations eligible for precision medicine therapies and trials.
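The abstract does not specify how the consensus-based classification is implemented. As a purely illustrative sketch, one common form of consensus is to accept an ancestry label only when enough independent classifiers agree on it. The function name `consensus_call`, the `min_agreement` threshold, the population codes, and the "Unresolved" fallback are all assumptions made for this example, not details from the study.

```python
from collections import Counter


def consensus_call(votes, min_agreement=2 / 3):
    """Return the most common label if enough classifiers agree on it.

    votes: list of ancestry labels, one per classifier (hypothetical input).
    min_agreement: assumed minimum fraction of votes the top label needs.
    Returns "Unresolved" when no label reaches the threshold.
    """
    if not votes:
        return "Unresolved"
    label, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= min_agreement:
        return label
    return "Unresolved"


# Hypothetical votes from three classifiers for two patients.
print(consensus_call(["EUR", "EUR", "AFR"]))
print(consensus_call(["EUR", "AFR", "EAS"]))
```

In this sketch the first patient receives a confident call because two of three classifiers agree, while the second is left unresolved rather than being forced into a category. How low-agreement cases are actually handled in the published workflow is not stated in the abstract.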