Division of Hematopathology and Transfusion Medicine, University Health Network, Toronto, ON, Canada.
Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada.
Mod Pathol. 2020 Oct;33(10):1874-1888. doi: 10.1038/s41379-020-0547-7. Epub 2020 May 15.
Classification of cancers by tissue of origin is fundamental to diagnostic pathology. While the combination of clinical data, tissue histology, and immunohistochemistry is usually sufficient, a small but non-negligible proportion of cases remains difficult to classify. These challenging cases justify ancillary molecular testing, including high-throughput DNA methylation array profiling, which promises cell-of-origin information and compatibility with formalin-fixed specimens. Although diagnostically powerful, methylation profiling platforms are costly and technically challenging to implement, particularly for less well-resourced laboratories. To address this, we simulated the performance of "minimalist" methylation-based tests for cancer classification using publicly available and internal institutional profiling data. These analyses showed that small, focused sets of the most informative CpG biomarkers from the arrays are sufficient for accurate diagnoses. As an illustrative example, one classifier, using information from just 53 of about 450,000 available CpG probes, achieved an accuracy of 94.5% on 2575 fresh primary validation cases across 28 cancer types from The Cancer Genome Atlas Network. Minimalist classifiers trained on formalin-fixed primary and metastatic cases also achieved generally high accuracies on additional datasets. These results support the potential of minimalist methylation testing, possibly via quantitative PCR and targeted next-generation sequencing platforms, in cancer classification.
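The idea of a "minimalist" classifier, keeping only a handful of the most informative probes and classifying on those alone, can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not state how probes were ranked or which classifier was used. The sketch assumes synthetic beta values, ranks probes by the range of their per-class mean methylation, and uses a nearest-centroid rule as a stand-in classifier. All function names and parameters are hypothetical.

```python
import random

def simulate(n_per_class, n_probes, informative, seed):
    """Synthetic beta values (assumption): informative[c] holds the probes
    that are highly methylated in class c; all other probes sit near 0.2."""
    rng = random.Random(seed)
    X, y = [], []
    for c, probes in enumerate(informative):
        for _ in range(n_per_class):
            row = [min(1.0, max(0.0, rng.gauss(0.8 if j in probes else 0.2, 0.1)))
                   for j in range(n_probes)]
            X.append(row)
            y.append(c)
    return X, y

def class_centroids(X, y, probes):
    """Mean beta value per class, restricted to the chosen probes."""
    sums, counts = {}, {}
    for row, c in zip(X, y):
        acc = sums.setdefault(c, [0.0] * len(probes))
        for i, j in enumerate(probes):
            acc[i] += row[j]
        counts[c] = counts.get(c, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def select_probes(X, y, k):
    """Rank probes by the range (max - min) of their class means; keep the top k.
    This ranking rule is an illustrative assumption, not the published method."""
    all_probes = list(range(len(X[0])))
    cents = class_centroids(X, y, all_probes)

    def spread(j):
        vals = [cents[c][j] for c in cents]
        return max(vals) - min(vals)

    return sorted(all_probes, key=spread, reverse=True)[:k]

def predict(row, cents, probes):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    def dist(c):
        return sum((row[j] - m) ** 2 for j, m in zip(probes, cents[c]))
    return min(cents, key=dist)

def accuracy(X, y, cents, probes):
    """Fraction of samples whose predicted class matches the label."""
    hits = sum(predict(r, cents, probes) == c for r, c in zip(X, y))
    return hits / len(y)

# Three hypothetical cancer types, 200 probes, two informative probes per type.
informative = [{0, 1}, {2, 3}, {4, 5}]
X_train, y_train = simulate(20, 200, informative, seed=0)
X_test, y_test = simulate(20, 200, informative, seed=1)

probes = select_probes(X_train, y_train, k=6)            # the "minimalist" panel
cents = class_centroids(X_train, y_train, probes)
acc = accuracy(X_test, y_test, cents, probes)
print(sorted(probes), acc)
```

On this toy data, the selected panel recovers the planted informative probes and ignores the other 194. Real array data is far noisier, so the sketch only shows the shape of the approach and says nothing about achievable accuracy.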