Sun Peishuo, Wu Ying, Yin Chaoyi, Jiang Hongyang, Xu Ying, Sun Huiyan
School of Artificial Intelligence, Jilin University, Changchun, China.
Phase I Clinical Trials Center, The First Affiliated Hospital, China Medical University, Shenyang, China.
Front Genet. 2022 May 2;13:866005. doi: 10.3389/fgene.2022.866005. eCollection 2022.
Molecular subtyping of cancer is recognized as a critical and challenging step towards individualized therapy. Most existing computational methods treat this problem as multi-class classification of cancer samples based on their gene expression profiles. Although these methods, especially deep learning, perform well in data classification, they usually require large amounts of data for model training and have limited interpretability. Moreover, because cancer is a complex systemic disease, the phenotypic differences between cancer samples can hardly be fully understood by analyzing single molecules alone, and molecular subtyping methods based on differential expression are reportedly not conserved. To address these issues, we present a new framework for molecular subtyping of cancer that identifies a robust, specific co-expression module for each cancer subtype, generates network features for each sample by perturbing the correlation levels of specific edges, and then trains a deep neural network for multi-class classification. When applied to molecular subtyping of breast cancer (BRCA) and stomach adenocarcinoma (STAD), the framework achieves better classification performance than existing methods. Beyond improving classification performance, we consider the specific co-expression modules selected for subtyping to be biologically meaningful, which potentially offers new insight for diagnostic biomarker design, mechanistic studies of cancer, and individualized treatment plan selection.
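The abstract does not give the exact formula used to generate per-sample network features, so the following is only a minimal sketch of one common way to perturb edge correlations: for each edge of a co-expression module, compare the Pearson correlation computed on a reference cohort with the correlation recomputed after adding the single sample. The function names `pearson` and `edge_perturbation_features`, the toy data, and the choice of Pearson correlation are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation of two 1-D arrays (0.0 if either is constant)."""
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    return float((xc * yc).sum() / denom) if denom > 0 else 0.0


def edge_perturbation_features(reference, sample, edges):
    """Hypothetical helper: one feature per edge of a co-expression module.

    reference: (n_samples, n_genes) expression matrix of a reference cohort
    sample:    (n_genes,) expression vector of the sample to characterize
    edges:     list of (gene_i, gene_j) index pairs
    Each feature is the change in correlation caused by adding the sample.
    """
    augmented = np.vstack([reference, sample])
    features = []
    for i, j in edges:
        base = pearson(reference[:, i], reference[:, j])
        perturbed = pearson(augmented[:, i], augmented[:, j])
        features.append(perturbed - base)
    return np.array(features)


# Toy data (assumed): genes 0 and 1 are strongly co-expressed in the reference.
rng = np.random.default_rng(0)
reference = rng.normal(size=(50, 4))
reference[:, 1] = reference[:, 0] + 0.1 * rng.normal(size=50)
edges = [(0, 1), (2, 3)]

# A sample that breaks the gene 0 / gene 1 co-expression pattern.
sample = np.array([3.0, -3.0, 0.0, 0.0])
features = edge_perturbation_features(reference, sample, edges)
```

In this sketch the feature for edge (0, 1) is clearly negative because the sample contradicts the co-expression seen in the reference, while the feature for edge (2, 3) stays close to zero. Feature vectors of this kind, one per sample, could then serve as input to a multi-class classifier such as a deep neural network.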