Sado Innocent Tatchum, Fitime Louis Fippo, Pelap Geraud Fokou, Tinku Claude, Meudje Gaelle Mireille, Bouetou Thomas Bouetou
Laboratory of Information System and Signal Processing, National Advanced School of Engineering Yaounde, Department of Computer Engineering, University of Yaounde I, Yaounde, Cameroon.
Laboratory of Information System and Signal Processing, National Advanced School of Engineering Yaounde, Department of Computer Engineering, University of Yaounde I, Yaounde, Cameroon; Smart Digital Strategy SARL Company, Yaounde, Cameroon.
J Biomed Inform. 2024 Dec;160:104751. doi: 10.1016/j.jbi.2024.104751. Epub 2024 Nov 19.
Cancer causes many deaths worldwide. Treatment depends first on detection, and it is most effective when the disease is detected at an early stage. As technology has advanced, several computer-aided diagnosis tools and image-based detection methods have been developed for cancer. Early detection, which is crucial for patient survival, nevertheless remains difficult. To detect cancer early, scientists have been using transcriptomic data, but this raises challenges of its own: the data are unlabeled and very large, and image-based techniques each focus on only one type of cancer. The purpose of this work is to develop a deep learning model that detects any type of cancer as an anomaly in transcriptomic data, as early as possible and specifically in the early stages. The model must be able to act independently and must not be restricted to any specific type of cancer. To achieve this goal, we modeled a deep neural network (a Variational Autoencoder) and then defined an algorithm for detecting anomalies in the output of the Variational Autoencoder. The Variational Autoencoder consists of an encoder and a decoder with a hidden layer. Using the TCGA and GTEx data, we trained the model for six types of cancer with the Adam optimizer, learning rate decay, and a two-component loss function. The lowest accuracy we obtained was 0.950, and the lowest recall was 0.830. This research yields a deep learning model for the detection of cancer as an anomaly in transcriptomic data.
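The pipeline outlined in the abstract (a two-component loss, a decaying learning rate, and an anomaly rule applied to the autoencoder output) can be illustrated with a minimal pure-Python sketch. The abstract does not specify the loss terms, the decay schedule, or the detection algorithm, so everything here is an assumption: the loss is taken to be mean squared reconstruction error plus a KL divergence term (the usual VAE formulation), the schedule is exponential, and the anomaly rule is a quantile threshold on scores from non-cancer samples. The names `vae_loss`, `decayed_lr`, `fit_threshold`, `is_anomaly`, and the parameters `beta` and `quantile` are hypothetical.

```python
import math


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Assumed two-component loss for one sample: reconstruction + KL.

    x, x_hat : input expression vector and its reconstruction
    mu, log_var : mean and log-variance of the latent Gaussian
    beta : hypothetical weight on the KL term
    """
    # Component 1: mean squared reconstruction error.
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    # Component 2: KL divergence between N(mu, exp(log_var)) and N(0, 1).
    kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                    for m, lv in zip(mu, log_var))
    return recon + beta * kl


def decayed_lr(base_lr, decay_rate, step, decay_steps):
    """Exponential learning rate decay (one possible schedule)."""
    return base_lr * decay_rate ** (step / decay_steps)


def fit_threshold(normal_scores, quantile=0.95):
    """Pick a cutoff from scores of normal (non-cancer) samples."""
    ordered = sorted(normal_scores)
    idx = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[idx]


def is_anomaly(score, threshold):
    """Flag a sample whose score exceeds the cutoff."""
    return score > threshold
```

In this sketch, a sample would be scored with `vae_loss` (or its reconstruction term alone) after training on normal tissue, and flagged when the score exceeds the cutoff returned by `fit_threshold`. The actual algorithm used in the paper may differ.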