Gal Jocelyn, Bailleux Caroline, Chardin David, Pourcher Thierry, Gilhodes Julia, Jing Lun, Guigonis Jean-Marie, Ferrero Jean-Marc, Milano Gerard, Mograbi Baharia, Brest Patrick, Chateau Yann, Humbert Olivier, Chamorey Emmanuel
University Côte d'Azur, Epidemiology and Biostatistics Department, Centre Antoine Lacassagne, Nice F-06189, France.
University Côte d'Azur, Medical Oncology Department, Centre Antoine Lacassagne, Nice F-06189, France.
Comput Struct Biotechnol J. 2020 Jun 3;18:1509-1524. doi: 10.1016/j.csbj.2020.05.021. eCollection 2020.
Genomics and transcriptomics have led to the widely used molecular classification of breast cancer (BC). However, heterogeneous biological behaviors persist within breast cancer subtypes. Metabolomics is a rapidly expanding field of study dedicated to cellular metabolism as affected by the environment. The aim of this study was to compare metabolomic signatures of BC obtained by 5 different unsupervised machine learning (ML) methods. Fifty-two consecutive patients with BC with an indication for adjuvant chemotherapy between 2013 and 2016 were retrospectively included. We performed metabolomic profiling of tumor resection samples using liquid chromatography-mass spectrometry. Four hundred and forty-nine identified metabolites were selected for further analysis. Clusters obtained using 5 unsupervised ML methods (PCA k-means, sparse k-means, spectral clustering, SIMLR and k-sparse) were compared in terms of clinical and biological characteristics. With an optimal partitioning parameter k = 3, the five methods identified three prognostic groups of patients (favorable, intermediate, unfavorable) with different clinical and biological profiles. SIMLR and k-sparse were the most effective methods in terms of clustering. Survival analysis revealed a significant difference in 5-year predicted overall survival (OS) between the 3 clusters. Further pathway analysis using the 449 selected metabolites showed significant differences in amino acid and glucose metabolism between BC histologic subtypes. Our results provide proof-of-concept for the use of unsupervised ML metabolomics enabling stratification and personalized management of BC patients. The design of novel computational methods incorporating ML and bioinformatics techniques should make available tools particularly suited to improving the outcome of cancer treatment and reducing cancer-related mortality.
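The abstract gives no implementation details, so the following is only a minimal sketch of what partitioning samples into k = 3 clusters can look like. It uses plain k-means (Lloyd's algorithm) with a deterministic farthest-first initialization, which is a simpler stand-in and not PCA k-means, sparse k-means, spectral clustering, SIMLR or k-sparse as used in the study. The data are synthetic, and every name (`make_synthetic_profiles`, `kmeans`, the group centers, the feature count) is a hypothetical choice made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_synthetic_profiles(n_per_group=5, n_features=4, seed=0):
    """Build toy 'metabolite profiles': three well-separated groups.

    Purely synthetic data (assumption); it has no relation to the
    52 patients or 449 metabolites described in the abstract.
    """
    rng = random.Random(seed)
    rows = []
    for center in (0.0, 10.0, 20.0):
        for _ in range(n_per_group):
            rows.append([rng.gauss(center, 1.0) for _ in range(n_features)])
    return rows


def kmeans(rows, k=3, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on a list of numeric vectors."""
    # Farthest-first initialization: start from the first row, then
    # repeatedly add the row farthest from its nearest chosen centroid.
    centroids = [list(rows[0])]
    while len(centroids) < k:
        farthest = max(
            rows, key=lambda r: min(squared_distance(r, c) for c in centroids)
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each row goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


profiles = make_synthetic_profiles()
cluster_labels, cluster_centroids = kmeans(profiles, k=3)
print(cluster_labels)
```

In a real analysis the rows would be scaled metabolite intensities per patient, and the resulting cluster labels would then be compared against clinical variables and survival, as the abstract describes.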