Štancl Paula, Hamel Nancy, Sigel Keith M, Foulkes William D, Karlić Rosa, Polak Paz
Bioinformatics Group, Division of Molecular Biology, Department of Biology, Faculty of Science, University of Zagreb, Zagreb, Croatia.
Cancer Research Program, Research Institute of the McGill University Health Centre, Montreal, QC, Canada.
Front Genet. 2022 Jun 17;13:852159. doi: 10.3389/fgene.2022.852159. eCollection 2022.
Gene-agnostic genomic biomarkers were recently developed to identify homologous recombination deficiency (HRD) tumors that are likely to respond to treatment with PARP inhibitors. Two machine-learning algorithms that predict HRD status, CHORD and HRDetect, use various HRD-associated features extracted from whole-genome sequencing (WGS) data and show high sensitivity in detecting patients with bi-allelic inactivation in all cancer types. When only DNA mutation data are used to detect potential causes of HRD, both HRDetect and CHORD find that 30-40% of cases classified as HRD are due to unknown causes. Here, we examined the impact of tumor-specific thresholds and of measuring promoter methylation of HR-related genes on the proportion of unexplained HRD cases across various tumor types. We gathered published CHORD and HRDetect probability scores for 828 breast, ovarian, and pancreatic cancer samples from previous studies, together with evidence of biallelic inactivation (by either DNA alterations or promoter methylation) in HR-related genes. ROC curve analysis was used to evaluate the performance of each classifier in each cancer type. Tenfold nested cross-validation was used to find the optimal threshold values of HRDetect and CHORD for classifying HR-deficient samples within each cancer type. With the universal threshold, HRDetect had higher sensitivity than CHORD in detecting biallelic inactivation and yielded a higher proportion of unexplained cases. When promoter methylation was excluded, the proportion of unexplained cases in ovarian carcinoma increased from 26.8 to 48.8% for HRDetect and from 14.7 to 41.2% for CHORD. A similar increase was observed in breast cancer. Applying cancer-type-specific thresholds led to similar sensitivity and specificity for both methods. The cancer-type-specific thresholds for HRDetect reduced the proportion of unexplained cases from 21 to 12.3% without reducing the 96% sensitivity to known events. For CHORD, unexplained cases were reduced from 10 to 9% while sensitivity increased from 85.3 to 93.9%. These results suggest that WGS-based HRD classifiers should be adjusted for tumor type. When cancer-type-specific thresholds are applied, only ∼10% of breast, ovarian, and pancreatic cancer cases in our dataset are not explained by known events.
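The threshold-tuning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the selection criterion (Youden's J, sensitivity + specificity - 1) is an assumption, since the abstract does not name one; the tenfold nested cross-validation is omitted for brevity; and the scores and labels are invented toy values, not data from the study. The function names are hypothetical.

```python
# Toy classifier probabilities for one cancer type (invented values).
scores = [0.05, 0.10, 0.20, 0.55, 0.60, 0.65, 0.70, 0.90, 0.95]
# True = known biallelic inactivation in an HR-related gene (invented labels).
labels = [False, False, False, False, False, True, False, True, True]


def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called HRD.

    Assumes both classes are present, otherwise a division by zero occurs.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y and s >= threshold)
    fn = sum(1 for s, y in pairs if y and s < threshold)
    tn = sum(1 for s, y in pairs if not y and s < threshold)
    fp = sum(1 for s, y in pairs if not y and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(scores, labels):
    """Return the observed score that maximizes Youden's J (assumed criterion)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: sum(sens_spec(scores, labels, t)) - 1)


def unexplained_fraction(scores, labels, threshold):
    """Share of HRD calls that lack a known biallelic inactivation event."""
    called = [y for s, y in zip(scores, labels) if s >= threshold]
    return sum(1 for y in called if not y) / len(called)


universal = 0.5  # a single cutoff applied to every cancer type
tuned = best_threshold(scores, labels)
for name, t in [("universal", universal), ("tuned", tuned)]:
    sens, spec = sens_spec(scores, labels, t)
    print(name, t, round(sens, 3), round(spec, 3),
          round(unexplained_fraction(scores, labels, t), 3))
```

On this toy data the tuned cutoff keeps sensitivity to known events unchanged while lowering the share of unexplained HRD calls, which is the qualitative pattern the abstract reports for cancer-type-specific thresholds. In a real analysis the threshold would be chosen on training folds and evaluated on held-out folds to avoid optimistic estimates.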