Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA.
Department of Computer Science, University of Georgia, Athens, GA, 30602, USA.
BMC Med Genomics. 2021 Nov 24;14(1):281. doi: 10.1186/s12920-021-01122-7.
BACKGROUND & AIMS: Cancer metastasis to distant organs is an evolutionarily selective process. A better understanding of the forces that give tumor seeds proliferative plasticity in distant soils is needed to develop and adapt treatments for this lethal stage of the disease. To this end, we aimed first to use transcript expression profiling features to predict the site-specific metastases of primary tumors and second to identify the determinants of tissue-specific progression.
METHODS: We used statistical machine learning to select the transcript features that optimize classification, and built tree-based classifiers to predict the tissue-specific sites of metastatic progression.
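The two steps named above, feature selection followed by a tree-based classifier, can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the selection statistic or the tree algorithm, so a one-way ANOVA F-score and a small Gini-impurity decision tree are assumed here, and the function names (`f_score`, `select_top_k`, `build_tree`, `predict`) and the toy expression values are hypothetical.

```python
from collections import Counter
from statistics import mean


def f_score(values, labels):
    """One-way ANOVA F statistic of a single feature across label groups."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    if df_between == 0 or df_within == 0:
        return 0.0  # statistic undefined; treat the feature as uninformative
    grand = mean(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / df_between) / (within / df_within)


def select_top_k(X, y, k):
    """Return the column indices of the k features with the highest F-score."""
    scores = [f_score(list(col), y) for col in zip(*X)]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(X, y, depth=0, max_depth=3):
    """Grow a binary tree; leaves are labels, nodes are (feature, threshold, left, right)."""
    majority = Counter(y).most_common(1)[0][0]
    if depth == max_depth or len(set(y)) == 1:
        return majority
    best = None  # (weighted impurity, feature index, threshold)
    for j in range(len(X[0])):
        # Dropping the largest value keeps both sides of every split non-empty.
        for t in sorted(set(row[j] for row in X))[:-1]:
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:  # every feature is constant, so no split is possible
        return majority
    _, j, t = best
    lo = [i for i, row in enumerate(X) if row[j] <= t]
    hi = [i for i, row in enumerate(X) if row[j] > t]
    return (j, t,
            build_tree([X[i] for i in lo], [y[i] for i in lo], depth + 1, max_depth),
            build_tree([X[i] for i in hi], [y[i] for i in hi], depth + 1, max_depth))


def predict(tree, row):
    """Walk from the root to a leaf and return its label (labels must not be tuples)."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree


# Hypothetical toy data: rows are tumors, columns are transcript expression values.
X = [[5.1, 0.2, 9.0], [4.9, 0.3, 1.0], [5.0, 0.1, 5.0],
     [1.2, 0.2, 8.5], [0.9, 0.3, 1.5], [1.1, 0.1, 4.0]]
y = ["lung", "lung", "lung", "bone", "bone", "bone"]

keep = select_top_k(X, y, k=1)
tree = build_tree([[row[j] for j in keep] for row in X], y)
```

Selecting features before growing the tree keeps the search over split thresholds small, which matters when the input has tens of thousands of transcripts rather than three.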
RESULTS: We developed a novel machine learning architecture that analyzes RNA transcriptome profiles from 33 cancer types in The Cancer Genome Atlas (TCGA) database. Our classifier identifies the tumor type, derives synthetic instances of primary tumors metastasizing to distant organs, and classifies the site-specific metastases of 16 cancer types metastasizing to 12 locations.
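The abstract does not say how the synthetic instances are derived. As one possible illustration only, the sketch below generates new expression profiles by linear interpolation between randomly paired real samples of the same class, a simplified relative of SMOTE-style oversampling without the nearest-neighbor search. The function name `interpolate_instances` and the sample values are hypothetical.

```python
import random


def interpolate_instances(samples, n_new, seed=0):
    """Create n_new synthetic rows, each lying on the segment between two real rows.

    Assumption: all rows in `samples` belong to the same class (for example,
    primary tumors known to metastasize to the same site).
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)  # two distinct real profiles
        w = rng.random()               # interpolation weight in [0, 1)
        synthetic.append([x + w * (z - x) for x, z in zip(a, b)])
    return synthetic


# Hypothetical expression profiles for one metastatic-site class.
liver_cases = [[2.0, 7.5, 0.4], [2.4, 6.9, 0.6], [1.8, 7.1, 0.5]]
augmented = liver_cases + interpolate_instances(liver_cases, n_new=5)
```

Interpolated rows stay inside the range spanned by the real samples, so this kind of augmentation balances class sizes without introducing expression values that were never observed.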
CONCLUSIONS: We have demonstrated that site-specific metastatic progression is predictable from transcriptomic profiling data of primary tumors, and that the overrepresented biological processes in tumors metastasizing to congruent distant loci overlap heavily. These results indicate that site-specific progression is organotropic and that core features of biological signaling pathways can be identified that may describe proliferative plasticity in distant soils.