Department of Human Oncology, University of Wisconsin, Madison.
Carbone Cancer Center, University of Wisconsin, Madison; Department of Medicine, University of Wisconsin, Madison, USA.
Ann Oncol. 2023 Sep;34(9):813-825. doi: 10.1016/j.annonc.2023.06.001. Epub 2023 Jun 16.
The isolation of cell-free DNA (cfDNA) from the bloodstream can be used to detect and analyze somatic alterations in circulating tumor DNA (ctDNA), and multiple cfDNA-targeted sequencing panels are now commercially available for Food and Drug Administration (FDA)-approved biomarker indications to guide treatment. More recently, cfDNA fragmentation patterns have emerged as a tool to infer epigenomic and transcriptomic information. However, most of these analyses used whole-genome sequencing, which is insufficient to identify FDA-approved biomarker indications in a cost-effective manner.
We used machine learning models of fragmentation patterns at the first coding exon in standard targeted cancer gene cfDNA sequencing panels to distinguish cancer from non-cancer patients and to identify the specific tumor type and subtype. We assessed this approach in two independent cohorts: a published cohort from GRAIL (breast, lung, and prostate cancers, non-cancer, n = 198) and an institutional cohort from the University of Wisconsin (UW; breast, lung, prostate, and bladder cancers, n = 320). Each cohort was split 70%/30% into training and validation sets.
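As an illustration of the 70%/30% split described above, the following is a minimal sketch in plain Python. It assumes the split is stratified by class label, which the abstract does not state; the function name `stratified_split`, the seed, and the example labels are hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into training and validation sets.

    Each class is shuffled and cut separately so that both sets keep
    roughly the same class proportions (an assumption for this sketch).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train_idx.extend(members[:cut])
        val_idx.extend(members[cut:])
    return sorted(train_idx), sorted(val_idx)


# Hypothetical labels for illustration only; not the cohort composition.
labels = (["breast"] * 10 + ["lung"] * 10
          + ["prostate"] * 10 + ["non-cancer"] * 10)
train_idx, val_idx = stratified_split(labels)
```

Fixing the seed keeps the split reproducible, so cross-validation on the training indices and the final check on the validation indices always use the same samples.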
In the UW cohort, training cross-validated accuracy was 82.1%, and accuracy in the independent validation cohort was 86.6% despite a median ctDNA fraction of only 0.06. In the GRAIL cohort, to assess how this approach performs at very low ctDNA fractions, the training and independent validation sets were split based on ctDNA fraction. Training cross-validated accuracy was 80.6%, and accuracy in the independent validation cohort was 76.3%. In the validation cohort, where the ctDNA fractions were all <0.05 and as low as 0.0003, the cancer versus non-cancer area under the curve was 0.99.
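For readers unfamiliar with the area under the curve reported above, the sketch below computes the area under the ROC curve for a binary cancer versus non-cancer classifier as the probability that a randomly chosen cancer sample scores higher than a randomly chosen non-cancer sample, with ties counted as one half. The function name `roc_auc` and the scores are hypothetical; this is not the authors' evaluation code.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for cancer, 0 for non-cancer.
    scores: classifier output, higher meaning more likely cancer.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # cancer sample ranked above non-cancer
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for illustration only.
scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

An area under the curve of 1.0 means every cancer sample scored above every non-cancer sample, and 0.5 corresponds to chance-level ranking.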
To our knowledge, this is the first study to demonstrate that sequencing from targeted cfDNA panels can be utilized to analyze fragmentation patterns to classify cancer types, dramatically expanding the potential capabilities of existing clinically used panels at minimal additional cost.