Integration of pan-cancer multi-omics data for novel mixed subgroup identification using machine learning methods.

Affiliations

Department of Electrical Engineering, Indian Institute of Technology Dharwad, Dharwad, Karnataka, India.

Department of Biosciences and Bioengineering, Indian Institute of Technology Dharwad, Dharwad, Karnataka, India.

Publication information

PLoS One. 2023 Oct 19;18(10):e0287176. doi: 10.1371/journal.pone.0287176. eCollection 2023.

Abstract

Cancer is a heterogeneous disease, and patients with tumors from different organs can share similar epigenetic and genetic alterations. It is therefore crucial to identify novel subgroups of patients with similar molecular characteristics. Accounting for patient heterogeneity during subgroup identification, irrespective of the tissue of origin, makes it possible to propose better treatment strategies. This work proposes a machine learning (ML) based pipeline for subgroup identification in pan-cancer. mRNA, miRNA, DNA methylation, and protein expression features from pan-cancer samples were concatenated and non-linearly projected to a lower dimension using an ML algorithm. The projected data were then clustered to identify novel multi-omics-based subgroups. Clinical characterization of these ML subgroups indicated significant differences in overall survival (OS) and disease-free survival (DFS) (p-value < 0.0001). Subgroups formed by patients with different tumors shared similar molecular alterations in terms of immune microenvironment, mutation profile, and enriched pathways. Further, decision-level and feature-level fused classification models were built to identify the novel subgroups for unseen samples. These classification models were also used to obtain class labels for the validation samples, and the molecular characteristics were verified. To summarize, this work identified novel ML subgroups using multi-omics data and showed that patients with different tumor types can be molecularly similar. We also proposed and validated classification models for subgroup identification. The proposed classification models can be used to identify the novel multi-omics subgroups, and the molecular characteristics of each subgroup can be used to design an appropriate treatment regimen.
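The pipeline described in the abstract (concatenate the omics features, project non-linearly to a lower dimension, cluster, then train fused classifiers for unseen samples) can be illustrated with a minimal sketch. The abstract does not name the projection algorithm, the clustering method, or the classifiers, so everything concrete below is an assumption made for illustration: Kernel PCA stands in for the non-linear projection, k-means for the clustering, logistic regression for the classifiers, and the data are synthetic with three planted groups. The number of subgroups, the feature counts, and the embedding size are arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_samples = 120

# Hypothetical latent group per sample, used only to give the synthetic data structure.
latent_group = rng.integers(0, 3, size=n_samples)


def make_block(n_features):
    """Synthetic stand-in for one omics matrix (samples x features)."""
    noise = rng.normal(size=(n_samples, n_features))
    return noise + 1.5 * latent_group[:, None]


# Four omics blocks; the feature counts are arbitrary assumptions.
omics = {
    "mrna": make_block(50),
    "mirna": make_block(20),
    "methylation": make_block(40),
    "protein": make_block(10),
}

# Scale each block separately, then concatenate along the feature axis.
scaled = {name: StandardScaler().fit_transform(x) for name, x in omics.items()}
concatenated = np.hstack(list(scaled.values()))

# Non-linear projection to a lower dimension (Kernel PCA is an assumed stand-in).
embedding = KernelPCA(n_components=5, kernel="rbf").fit_transform(concatenated)

# Cluster the embedding to obtain subgroup labels (k-means is an assumed stand-in).
subgroups = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embedding)

# Hold out samples to play the role of unseen data.
train_idx, test_idx = train_test_split(
    np.arange(n_samples), test_size=0.25, random_state=0
)

# Feature-level fusion: a single classifier on the concatenated features.
feature_clf = LogisticRegression(max_iter=1000)
feature_clf.fit(concatenated[train_idx], subgroups[train_idx])
feature_pred = feature_clf.predict(concatenated[test_idx])

# Decision-level fusion: one classifier per omics block, with the
# per-block class probabilities averaged before taking the argmax.
block_probs = []
for x in scaled.values():
    clf = LogisticRegression(max_iter=1000)
    clf.fit(x[train_idx], subgroups[train_idx])
    block_probs.append(clf.predict_proba(x[test_idx]))
mean_probs = np.mean(block_probs, axis=0)
decision_pred = feature_clf.classes_[np.argmax(mean_probs, axis=1)]

print("feature-level accuracy:", np.mean(feature_pred == subgroups[test_idx]))
print("decision-level accuracy:", np.mean(decision_pred == subgroups[test_idx]))
```

The two fusion styles differ in where the modalities are combined: feature-level fusion merges the inputs before a single model sees them, while decision-level fusion merges the outputs of per-modality models. Averaging probabilities is only one of several possible decision rules and is not necessarily the one used in the paper.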

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e0b2/10586677/4ac453498f42/pone.0287176.g001.jpg
