Department of Human Genetics, Laboratory for Cytogenetics and Genome Research, KU Leuven, Leuven, Belgium.
Department of Oncology, Laboratory of Gynecological Oncology, KU Leuven, Leuven, Belgium.
Clin Chem. 2022 Sep 1;68(9):1164-1176. doi: 10.1093/clinchem/hvac095.
Cell-free DNA (cfDNA) analysis holds great promise for non-invasive cancer screening, diagnosis, and monitoring. We hypothesized that mining the patterns in cfDNA shallow whole-genome sequencing datasets from patients with cancer could improve cancer detection.
By applying unsupervised clustering and supervised machine learning on large cfDNA shallow whole-genome sequencing datasets from healthy individuals (n = 367) and patients with different hematological (n = 238) and solid malignancies (n = 320), we identified cfDNA signatures that enabled cancer detection and typing.
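As an illustration of how shallow whole-genome sequencing data can be turned into features for a classifier, here is a minimal sketch. It is not the authors' pipeline. The per-bin read counts, the normalization against a healthy-median reference, and the nearest-centroid classifier are all assumptions chosen for brevity, and every name and number in the code is hypothetical toy data.

```python
import math
from statistics import median


def to_fractions(bin_counts):
    """Convert raw per-bin read counts to fractions of the sample's total reads."""
    total = sum(bin_counts)
    return [c / total for c in bin_counts]


def log2_ratio_profile(bin_counts, healthy_counts, pseudo=1e-9):
    """log2 ratio of a sample's per-bin read fraction to the healthy median fraction."""
    sample = to_fractions(bin_counts)
    healthy = [to_fractions(h) for h in healthy_counts]
    reference = [median(col) for col in zip(*healthy)]
    return [math.log2((s + pseudo) / (r + pseudo)) for s, r in zip(sample, reference)]


def centroid(profiles):
    """Per-bin mean of a list of profiles."""
    return [sum(col) / len(col) for col in zip(*profiles)]


def nearest_centroid(profile, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(profile, centroids[label]))


# Hypothetical toy data: four genomic bins per sample (assumption, not from the study).
healthy_counts = [[100, 100, 100, 100], [110, 90, 100, 100], [95, 105, 100, 100]]
cancer_counts = [[160, 100, 100, 100], [150, 95, 105, 100]]  # relative gain in bin 0

centroids = {
    "healthy": centroid([log2_ratio_profile(s, healthy_counts) for s in healthy_counts]),
    "cancer": centroid([log2_ratio_profile(s, healthy_counts) for s in cancer_counts]),
}

new_sample = log2_ratio_profile([155, 100, 100, 100], healthy_counts)
predicted_label = nearest_centroid(new_sample, centroids)
```

A real analysis would use thousands of bins, correct for GC content and mappability, and train a far more capable model. The sketch only shows the shape of the data flow from read counts to a predicted class.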
Unsupervised clustering revealed cancer type-specific sub-grouping. Classification using a supervised machine learning model yielded accuracies of 96% and 65% in discriminating hematological and solid malignancies from healthy controls, respectively. The accuracy of disease type prediction was 85% and 70% for the hematological and solid cancers, respectively. The potential utility of the approach in managing a specific cancer type was demonstrated by distinguishing benign adnexal masses from invasive and borderline ones, with areas under the curve of 0.87 and 0.74, respectively.
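For readers unfamiliar with the two metrics reported here, the following sketch shows how accuracy and the area under the ROC curve can be computed from predictions. It uses the standard pairwise (Mann-Whitney) formulation of the AUC. The labels and scores are hypothetical toy values, not data from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = malignant, 0 = benign; scores are classifier outputs.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_acc = accuracy(["benign", "invasive", "benign"], ["benign", "invasive", "invasive"])
```

Accuracy depends on a single decision threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why both are commonly reported.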
This approach provides a generic analytical strategy for non-invasive pan-cancer detection and cancer type prediction.