Department of Pathology, Leiden University Medical Center, Leiden, The Netherlands.
Department of Cell and Chemical Biology, Leiden University Medical Center, Leiden, The Netherlands.
PLoS Comput Biol. 2019 Feb 20;15(2):e1006826. doi: 10.1371/journal.pcbi.1006826. eCollection 2019 Feb.
Distinguishing between the many soft tissue sarcoma subtypes on morphology alone is often challenging. Moreover, disease outcome is highly variable, even between patients with the same disease. Machine learning on transcriptome sequencing data could be a valuable new tool for understanding differences between and within entities. Here we used machine learning to identify novel diagnostic and prognostic markers and therapeutic targets for soft tissue sarcomas. Gene expression data were obtained from The Cancer Genome Atlas, the Genotype-Tissue Expression project and the French Sarcoma Group. Using unsupervised t-Distributed Stochastic Neighbor Embedding clustering and a deep neural network, we identified three groups of tumors with overlapping molecular profiles. The three groups corresponded to subtypes that also overlap morphologically. Using a random forest algorithm, we identified novel diagnostic markers that distinguish synovial sarcoma from malignant peripheral nerve sheath tumor (MPNST), and we validated them by qRT-PCR in an independent series. Next, we identified prognostic genes that are strong predictors of disease outcome when used in a k-nearest neighbor algorithm. These prognostic genes were further validated in expression data from the French Sarcoma Group. One of them, HMMR, was validated as a prognostic gene for disease-free interval by immunohistochemistry on a tissue microarray of an independent series of leiomyosarcomas. Furthermore, reconstruction of regulatory networks combined with data from the Connectivity Map showed, among other findings, that HDAC inhibitors could be an effective therapy for multiple soft tissue sarcoma subtypes. A viability assay with two HDAC inhibitors confirmed that both leiomyosarcoma and synovial sarcoma are sensitive to HDAC inhibition. In this study we identified novel diagnostic markers, prognostic markers and therapeutic leads from multiple soft tissue sarcoma gene expression datasets. Thus, machine learning algorithms are powerful new tools to improve our understanding of rare tumor entities.
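The k-nearest neighbor step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the outcome labels `good` and `poor`, the three-gene panel, the choice of `k=5` and the helper names `knn_predict` and `leave_one_out_accuracy` are all assumptions made for illustration. It only shows how a sample's outcome group could be predicted from the expression of a few prognostic genes by majority vote among its nearest neighbors, with accuracy estimated by leave-one-out evaluation.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=5):
    """Predict a label by majority vote among the k nearest training samples."""
    # Sort training samples by Euclidean distance to the query profile.
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples, labels, k=5):
    """Hold out each sample in turn and predict it from all the others."""
    correct = 0
    for i, (query, truth) in enumerate(zip(samples, labels)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        correct += knn_predict(train_x, train_y, query, k) == truth
    return correct / len(samples)


# Synthetic expression values (hypothetical, not from the study):
# 3 "prognostic genes", 30 samples per outcome group, with the
# "poor" group shifted by 3 standard deviations on every gene.
rng = random.Random(0)
n_per_group, n_genes = 30, 3
samples, labels = [], []
for label, shift in (("good", 0.0), ("poor", 3.0)):
    for _ in range(n_per_group):
        samples.append([rng.gauss(shift, 1.0) for _ in range(n_genes)])
        labels.append(label)

accuracy = leave_one_out_accuracy(samples, labels, k=5)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

On real expression data, genes would typically be put on a comparable scale first, since Euclidean distance is otherwise dominated by the most variable genes.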