Wallis Fleur S A, Baker-Hernandez John L, van Tuil Marc, van Hamersveld Claudia, Koudijs Marco J, Verwiel Eugène T P, Janse Alex, Hiemcke-Jiwa Laura S, de Krijger Ronald R, Kranendonk Mariëtte E G, Vermeulen Marijn A, Wesseling Pieter, Flucke Uta E, de Haas Valérie, Luesink Maaike, Hoving Eelco W, Vormoor Josef H, van Noesel Max M, Hehir-Kwa Jayne Y, Tops Bastiaan B J, Kemmeren Patrick, Kester Lennart A
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands.
Princess Máxima Center for Paediatric Oncology, Utrecht, the Netherlands; Department of Pathology, UMC Utrecht, Utrecht, the Netherlands.
EBioMedicine. 2025 Jan;111:105506. doi: 10.1016/j.ebiom.2024.105506. Epub 2024 Dec 21.
With many rare tumour types, acquiring the correct diagnosis is a challenging but crucial process in paediatric oncology. Historically, diagnosis has been based on the histology and morphology of the disease. However, advances in genome-wide profiling techniques such as RNA sequencing now allow the development of molecular classification tools.
Here, we present M&M, a pan-paediatric cancer ensemble-based machine learning algorithm tailored towards inclusion of rare tumour types.
The RNA-seq-based algorithm can classify 52 different tumour types (precision ∼99%, recall ∼80%), plus the underlying 96 tumour subtypes (precision ∼96%, recall ∼70%). For low-confidence classifications, a comparable precision is achieved when including the three highest-scoring labels. We then validated M&M on an internal dataset (precision 99%, recall 76%) and an external dataset from the KidsFirst initiative (precision 98%, recall 77%). Finally, we show that M&M performs similarly to existing disease- or domain-specific classification algorithms based on RNA sequencing or methylation data.
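To illustrate why recall can sit well below precision in a classifier that withholds low-confidence calls, the following is a minimal sketch and not the M&M implementation. It assumes, hypothetically, that each ensemble member casts one vote per sample, that confidence is the share of votes for the top label, that a call counts as high-confidence at or above a threshold of 0.8, that precision is computed over high-confidence calls only, and that recall is computed over all samples. None of these details are stated in the abstract; the labels and votes are toy data.

```python
from collections import Counter

def ensemble_predict(votes, threshold=0.8, top_k=3):
    """Aggregate the votes of all ensemble members for one sample.

    Returns (ranked_labels, confidence, is_confident). Confidence is the
    fraction of votes for the top label (an assumed definition).
    """
    counts = Counter(votes)
    ranked = [label for label, _ in counts.most_common(top_k)]
    confidence = counts[ranked[0]] / len(votes)
    return ranked, confidence, confidence >= threshold

def precision_recall(samples, threshold=0.8):
    """Precision over high-confidence calls; recall over all samples."""
    confident = correct = 0
    for true_label, votes in samples:
        ranked, _, is_confident = ensemble_predict(votes, threshold)
        if is_confident:
            confident += 1
            correct += ranked[0] == true_label
    precision = correct / confident if confident else 0.0
    recall = correct / len(samples) if samples else 0.0
    return precision, recall

def low_confidence_top_k_rate(samples, threshold=0.8, top_k=3):
    """Share of low-confidence samples whose true label is in the top k."""
    hits = total = 0
    for true_label, votes in samples:
        ranked, _, is_confident = ensemble_predict(votes, threshold, top_k)
        if not is_confident:
            total += 1
            hits += true_label in ranked
    return hits / total if total else 0.0

# Toy data: (true label, one vote per hypothetical ensemble member).
samples = [
    ("type_A", ["type_A"] * 10),                                  # confident, correct
    ("type_B", ["type_B"] * 9 + ["type_C"]),                      # confident, correct
    ("type_D", ["type_E"] * 4 + ["type_D"] * 3 + ["type_F"] * 3), # low confidence
    ("type_G", ["type_G"] * 8 + ["type_H"] * 2),                  # confident, correct
    ("type_E", ["type_D"] * 9 + ["type_E"]),                      # confident, wrong
]

precision, recall = precision_recall(samples)
top3_rate = low_confidence_top_k_rate(samples)
print(precision, recall, top3_rate)
```

In this toy set, the low-confidence sample lowers recall without affecting precision, and its true label still appears among the three highest-ranked labels, which mirrors the kind of trade-off the reported figures describe.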
M&M's pan-cancer setup allows for easy clinical implementation, requiring only one classifier for all incoming diagnostic samples, including samples from different tumour stages and treatment statuses. Simultaneously, its performance is comparable to existing tumour- and tissue-specific classifiers. The introduction of an extensive pan-cancer classifier in diagnostics has the potential to increase diagnostic accuracy for many paediatric cancer cases, thereby contributing towards optimal patient survival and quality of life.
Financial support was provided by the Foundation Children Cancer Free (KiKa core funding) and Adessium Foundation.