Danaee Padideh, Ghaeini Reza, Hendrix David A
School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97330, USA
Pac Symp Biocomput. 2017;22:219-229. doi: 10.1142/9789813207813_0022.
Cancer detection from gene expression data remains challenging because these data are high-dimensional and complex. After decades of research, uncertainty persists in both the clinical diagnosis of cancer and the identification of tumor-specific markers. Here we present a deep learning approach to cancer detection and to the identification of genes critical for the diagnosis of breast cancer. First, we used a Stacked Denoising Autoencoder (SDAE) to extract functional features from high-dimensional gene expression profiles. Next, we evaluated the extracted representation with supervised classification models to verify that the new features are useful for cancer detection. Lastly, we identified a set of highly interactive genes by analyzing the SDAE connectivity matrices. Our results and analysis indicate that these highly interactive genes could serve as biomarkers for the detection of breast cancer and deserve further study.
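The feature-extraction and connectivity-analysis steps described in the abstract can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy data, the layer sizes, the tanh encoder with a linear decoder, the plain gradient-descent training, and the helper names train_dae and connectivity_scores are all hypothetical. In particular, scoring genes by the product of the encoder weight matrices is only one plausible reading of "analyzing the SDAE connectivity matrices", and it ignores the nonlinearities between layers. The supervised classification step is omitted.

```python
import numpy as np


def train_dae(X, n_hidden, noise_std=0.1, lr=0.1, epochs=300, seed=0):
    """Train one denoising autoencoder layer (tanh encoder, linear decoder)
    with full-batch gradient descent. Returns (W_enc, b_enc, losses)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(0.0, 0.01, (d, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = rng.normal(0.0, 0.01, (n_hidden, d))
    b_dec = np.zeros(d)
    losses = []
    for _ in range(epochs):
        X_noisy = X + rng.normal(0.0, noise_std, X.shape)  # corrupt the input
        H = np.tanh(X_noisy @ W_enc + b_enc)               # encode corrupted input
        err = (H @ W_dec + b_dec) - X                      # compare with the CLEAN input
        losses.append(0.5 * float(np.mean(np.sum(err ** 2, axis=1))))
        # Backpropagate the squared reconstruction error.
        g_out = err / n
        g_pre = (g_out @ W_dec.T) * (1.0 - H ** 2)
        W_dec -= lr * (H.T @ g_out)
        b_dec -= lr * g_out.sum(axis=0)
        W_enc -= lr * (X_noisy.T @ g_pre)
        b_enc -= lr * g_pre.sum(axis=0)
    return W_enc, b_enc, losses


def connectivity_scores(weight_matrices):
    """Chain the encoder weights (genes x final code units) and score each
    gene by its total absolute connection strength. Assumed scoring rule."""
    C = weight_matrices[0]
    for W in weight_matrices[1:]:
        C = C @ W
    return np.abs(C).sum(axis=1)


# Hypothetical toy data: 100 samples x 20 "genes"; genes 0-4 share one latent signal.
rng = np.random.default_rng(42)
z = rng.normal(size=(100, 1))
X = rng.normal(0.0, 0.05, (100, 20))
X[:, :5] += 0.5 * z

# Greedy layer-wise stacking: layer 1 learns from X, layer 2 from layer-1 codes.
W1, b1, losses1 = train_dae(X, n_hidden=3, seed=1)
H1 = np.tanh(X @ W1 + b1)
W2, b2, losses2 = train_dae(H1, n_hidden=2, seed=2)

scores = connectivity_scores([W1, W2])
ranking = np.argsort(scores)[::-1]  # gene indices, strongest connectivity first
print("top genes by connectivity score:", ranking[:5])
```

In a real analysis, the rows of X would be expression profiles, the final-layer codes would be passed to a classifier, and the top-ranked genes would be candidates for follow-up rather than confirmed biomarkers.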