IEEE/ACM Trans Comput Biol Bioinform. 2019 Nov-Dec;16(6):2089-2100. doi: 10.1109/TCBB.2018.2822803. Epub 2018 Apr 4.
The emergence of deep learning has influenced numerous machine-learning-based applications and research areas. Its success rests on two main advantages: 1) it can learn very complex non-linear relationships between features, and 2) it can leverage information from unlabeled data that does not belong to the problem at hand. This paper presents a transfer learning procedure for cancer classification that combines feature selection and normalization techniques with sparse auto-encoders on gene expression data. When classifying any two tumor types, data from other tumor types were used in an unsupervised manner to improve the feature representation. The performance of our algorithm was tested on 36 two-class benchmark datasets from the GEMLeR repository. Statistical tests indicate that our algorithm outperforms several commonly used cancer classification approaches. Deep-learning-based molecular disease classification can be used to guide decisions on the diagnosis and treatment of diseases, and may therefore have important applications in precision medicine.
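The procedure described in the abstract (feature selection, normalization, then a sparse auto-encoder trained without labels on data that includes other tumor types) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Everything specific here is an assumption made for illustration: variance-based gene selection, z-score normalization, a single hidden layer with an L1 sparsity penalty, plain full-batch gradient descent, the hypothetical function names, and the synthetic random matrices that stand in for expression data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def select_and_normalize(X_ref, X, k):
    """Keep the k highest-variance columns of X_ref, then z-score X
    using the mean and standard deviation of X_ref (assumed scheme)."""
    idx = np.argsort(X_ref.var(axis=0))[-k:]
    mu = X_ref[:, idx].mean(axis=0)
    sd = X_ref[:, idx].std(axis=0) + 1e-8
    return (X[:, idx] - mu) / sd

def train_sparse_autoencoder(X, n_hidden=8, l1=1e-3, lr=0.05, epochs=200, seed=0):
    """One-hidden-layer auto-encoder with an L1 penalty on the hidden
    activations, fitted by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)      # hidden code, values in (0, 1)
        X_hat = H @ W2 + b2           # linear reconstruction
        err = X_hat - X
        loss = 0.5 * np.mean(np.sum(err ** 2, axis=1)) + l1 * np.mean(np.sum(H, axis=1))
        losses.append(loss)
        # Backpropagation of the loss above.
        d_xhat = err / n
        dW2 = H.T @ d_xhat
        db2 = d_xhat.sum(axis=0)
        dH = d_xhat @ W2.T + l1 / n   # L1 term: H > 0, so its gradient is constant
        dZ = dH * H * (1.0 - H)
        dW1 = X.T @ dZ
        db1 = dZ.sum(axis=0)
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return W1, b1, losses

def encode(X, W1, b1):
    """Map normalized expression profiles to the learned hidden representation."""
    return sigmoid(X @ W1 + b1)

# Synthetic stand-in data (samples x genes); NOT real expression data.
rng = np.random.default_rng(42)
scale = rng.uniform(0.5, 3.0, size=50)
X_other = rng.normal(size=(120, 50)) * scale   # unlabeled samples from other tumor types
X_task = rng.normal(size=(40, 50)) * scale     # samples of the two tumor types to classify

# Unsupervised steps use the pooled data and never see class labels.
pool = np.vstack([X_other, X_task])
Z_pool = select_and_normalize(pool, pool, k=20)
Z_task = select_and_normalize(pool, X_task, k=20)

W1, b1, losses = train_sparse_autoencoder(Z_pool, n_hidden=8)
codes = encode(Z_task, W1, b1)   # shape (40, 8): features for a downstream classifier
```

In this sketch the labels of the two target tumor types would enter only afterwards, when a classifier of one's choice is fitted on `codes`; the representation itself is learned from the pooled, unlabeled samples.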