Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York.
Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, New York.
Cancer Discov. 2024 Jun 3;14(6):1064-1081. doi: 10.1158/2159-8290.CD-23-0996.
Tumor type guides clinical treatment decisions in cancer, but histology-based diagnosis remains challenging. Genomic alterations are highly diagnostic of tumor type, and tumor-type classifiers trained on genomic features have been explored. However, the most accurate methods are not clinically feasible, because they either rely on features derived from whole-genome sequencing (WGS) or predict across only a limited number of cancer types. We use genomic features from a data set of 39,787 solid tumors sequenced using a clinically targeted cancer gene panel to develop Genome-Derived-Diagnosis Ensemble (GDD-ENS): a hyperparameter ensemble for classifying tumor type using deep neural networks. GDD-ENS achieves 93% accuracy for high-confidence predictions across 38 cancer types, rivaling the performance of WGS-based methods. GDD-ENS can also guide diagnoses of rare tumor types and cancers of unknown primary, and it can incorporate patient-specific clinical information for improved predictions. Overall, integrating GDD-ENS into prospective clinical sequencing workflows could provide clinically relevant tumor-type predictions to guide treatment decisions in real time.
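As an illustration of how an ensemble classifier of this kind can yield a tumor-type call together with a confidence flag, the following minimal sketch averages the softmax outputs of several independently trained networks and marks a prediction as high-confidence when the top averaged probability clears a cutoff. It is not the GDD-ENS implementation: the function names, the array layout, and the 0.75 cutoff are assumptions chosen for the example.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def ensemble_predict(member_logits, threshold=0.75):
    """Combine per-model logits into one prediction per sample.

    member_logits: array-like of shape (n_models, n_samples, n_classes),
        one slice per ensemble member (hypothetical layout).
    threshold: illustrative cutoff for calling a prediction
        "high-confidence"; an assumption, not a value from the abstract.
    """
    # Average the class probabilities across ensemble members.
    probs = softmax(np.asarray(member_logits, dtype=float)).mean(axis=0)
    labels = probs.argmax(axis=-1)       # index of the predicted tumor type
    confidence = probs.max(axis=-1)      # top averaged probability
    return labels, confidence, confidence >= threshold

# Two ensemble members, two tumors, three candidate tumor types.
member_logits = [
    [[5.0, 0.0, 0.0], [1.0, 1.0, 0.0]],  # model A
    [[5.0, 0.0, 0.0], [0.0, 1.0, 1.0]],  # model B
]
labels, confidence, is_high_conf = ensemble_predict(member_logits)
```

In this toy input the two members agree strongly on the first tumor, so it is flagged as high-confidence, while they disagree on the second, so its averaged top probability stays low and it is left unflagged.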
We describe a highly accurate tumor-type prediction model, designed specifically for clinical implementation. Our model relies only on widely used cancer gene panel sequencing data, predicts across 38 distinct cancer types, and supports integration of patient-specific nongenomic information for enhanced decision support in challenging diagnostic situations. See related commentary by Garg, p. 906. This article is featured in Selected Articles from This Issue, p. 897.
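One generic way to fold patient-specific nongenomic information into a classifier's output is to reweight the predicted class probabilities by a prior over tumor types and renormalize. The sketch below shows that reweighting only; the abstract does not state that this is the mechanism the authors use, and the function name and example numbers are hypothetical.

```python
import numpy as np

def apply_clinical_prior(probs, prior):
    """Reweight class probabilities by a patient-specific prior.

    probs: model output probabilities over tumor types, shape (n_classes,).
    prior: nonnegative weights over the same tumor types (hypothetical),
        e.g. reflecting a biopsy site or other clinical context.
    Returns the renormalized product, which sums to 1.
    """
    probs = np.asarray(probs, dtype=float)
    prior = np.asarray(prior, dtype=float)
    weighted = probs * prior
    total = weighted.sum()
    if total == 0:
        raise ValueError("prior excludes every class with nonzero probability")
    return weighted / total

# The genomic model slightly favors type 0, but the clinical context
# (illustrative numbers) strongly favors type 1.
model_probs = [0.5, 0.3, 0.2]
clinical_prior = [0.1, 0.8, 0.1]
adjusted = apply_clinical_prior(model_probs, clinical_prior)
```

Here the adjusted distribution shifts the top call from type 0 to type 1, which is the kind of change a strong clinical prior is meant to produce when the genomic evidence alone is ambiguous.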