Predicting Cancer Tissue-of-Origin by a Machine Learning Method Using DNA Somatic Mutation Data.

Author information

Liu Xiaojun, Li Lianxing, Peng Lihong, Wang Bo, Lang Jidong, Lu Qingqing, Zhang Xizhe, Sun Yi, Tian Geng, Zhang Huajun, Zhou Liqian

Affiliations

School of Computer Science, Hunan University of Technology, Zhuzhou, China.

Chifeng Municipal Hospital, Chifeng, China.

Publication information

Front Genet. 2020 Jul 14;11:674. doi: 10.3389/fgene.2020.00674. eCollection 2020.

Abstract

Patients with carcinoma of unknown primary (CUP) account for 3-5% of all cancer cases. A large number of metastatic cancers require further diagnosis to determine their tissue of origin. However, diagnosing CUP and identifying its primary site are challenging. Previous studies have suggested that molecular profiling of tissue-specific genes could be useful in inferring the primary tissue of a tumor. The purpose of this study was to evaluate how well somatic mutations detected in a tumor can identify the cancer tissue of origin. We downloaded the somatic mutation datasets from the International Cancer Genome Consortium project. The random forest algorithm was used to extract features, and a classifier was built with logistic regression. Specifically, the somatic mutations of 300 genes were extracted; these genes are significantly enriched in functions such as cell-to-cell adhesion. The prediction accuracy of tissue-of-origin inference for 3,374 cancer samples across 13 cancer types reached 81% in 10-fold cross-validation. Our method could be useful in identifying the cancer tissue of origin, as well as in the diagnosis and treatment of cancers.
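The workflow outlined in the abstract (random forest feature extraction, a logistic regression classifier, and 10-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. The synthetic binary sample-by-gene mutation matrix, the function names, the matrix dimensions, the number of selected genes, and all hyperparameters are assumptions made for the example. The abstract does not say how mutations were encoded or whether feature selection was repeated inside each fold.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline


def make_synthetic_mutations(n_samples=390, n_genes=200, n_types=13, seed=0):
    """Hypothetical stand-in for real data: a 0/1 sample-by-gene matrix."""
    rng = np.random.default_rng(seed)
    y = np.arange(n_samples) % n_types  # balanced cancer-type labels
    # Sparse background mutations shared by all cancer types.
    X = (rng.random((n_samples, n_genes)) < 0.03).astype(np.int8)
    # Assumption: each type has 5 marker genes that are mutated more often.
    for t in range(n_types):
        markers = np.arange(t * 5, t * 5 + 5)
        rows = np.flatnonzero(y == t)
        extra = (rng.random((rows.size, markers.size)) < 0.6).astype(np.int8)
        X[np.ix_(rows, markers)] |= extra
    return X, y


def build_pipeline(n_selected=30, seed=0):
    """Random forest importances pick genes; logistic regression classifies."""
    selector = SelectFromModel(
        RandomForestClassifier(n_estimators=100, random_state=seed),
        max_features=n_selected,
        threshold=-np.inf,  # keep exactly the top n_selected genes
    )
    classifier = LogisticRegression(max_iter=1000)
    return make_pipeline(selector, classifier)


def mean_cv_accuracy(X, y, n_splits=10, seed=0):
    """Mean accuracy over stratified k-fold cross-validation."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = cross_val_score(build_pipeline(seed=seed), X, y,
                             cv=cv, scoring="accuracy")
    return scores.mean()


if __name__ == "__main__":
    X, y = make_synthetic_mutations()
    print(f"10-fold CV accuracy on synthetic data: {mean_cv_accuracy(X, y):.3f}")
```

Because the selector sits inside the pipeline, the genes are re-selected on the training portion of every fold, so the held-out samples do not influence feature selection. The accuracy printed here reflects only the synthetic data and says nothing about the 81% reported in the study.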

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2329/7372518/ce2fcc27e8f2/fgene-11-00674-g001.jpg
