Classification of non-TCGA cancer samples to TCGA molecular subtypes using compact feature sets.

Authors

Ellrott Kyle, Wong Christopher K, Yau Christina, Castro Mauro A A, Lee Jordan A, Karlberg Brian J, Grewal Jasleen K, Lagani Vincenzo, Tercan Bahar, Friedl Verena, Hinoue Toshinori, Uzunangelov Vladislav, Westlake Lindsay, Loinaz Xavier, Felau Ina, Wang Peggy I, Kemal Anab, Caesar-Johnson Samantha J, Shmulevich Ilya, Lazar Alexander J, Tsamardinos Ioannis, Hoadley Katherine A, Robertson A Gordon, Knijnenburg Theo A, Benz Christopher C, Stuart Joshua M, Zenklusen Jean C, Cherniack Andrew D, Laird Peter W

Affiliations

Oregon Health and Science University, Portland, OR 97239, USA.

Biomolecular Engineering Department, School of Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA.

Publication information

Cancer Cell. 2025 Feb 10;43(2):195-212.e11. doi: 10.1016/j.ccell.2024.12.002. Epub 2025 Jan 2.

Abstract

Molecular subtypes, such as those defined by The Cancer Genome Atlas (TCGA), delineate a cancer's underlying biology and hold promise for informing a patient's prognosis and treatment plan. However, most approaches used in the discovery of subtypes are not suitable for assigning subtype labels to new cancer specimens from other studies or clinical trials. Here, we address this barrier by applying five different machine learning approaches to multi-omic data from 8,791 TCGA tumor samples comprising 106 subtypes from 26 different cancer cohorts. The resulting models use small numbers of features to classify new samples into previously defined TCGA molecular subtypes, a step toward molecular subtype application in the clinic. We validate selected classifiers using external datasets. Predictive performance and classifier-selected features yield insight into the different machine learning approaches and genomic data platforms. For each cancer and data type, we provide containerized versions of the top-performing models as a public resource.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/95b5/11949768/d23c6a88dd2a/nihms-2046254-f0002.jpg
