Macías-García Laura, Luna-Romera José María, García-Gutiérrez Jorge, Martínez-Ballesteros María, Riquelme-Santos José C, González-Cámpora Ricardo
Department of Anatomic Pathology. Hospital Infanta Elena, Huelva, Spain.
Department of Computer Languages and Systems, ETSII, University of Seville, Spain.
J Biomed Inform. 2017 Aug;72:33-44. doi: 10.1016/j.jbi.2017.06.020. Epub 2017 Jun 27.
Breast cancer is the most common cause of cancer death in women. Today, the post-transcriptional protein products of the genes involved in breast cancer can be identified by immunohistochemistry. However, this method suffers from intra-observer and inter-observer variability in the assessment of pathologic variables, which may lead to misleading conclusions. A well-chosen combination of preprocessing techniques may help to reduce observer variability. Deep learning has emerged as a powerful technique for machine learning tasks such as classification and regression. The aim of this work is to use autoencoders (neural networks commonly used to feed deep learning architectures) to improve the quality of the data used to develop immunohistochemistry signatures with prognostic value in breast cancer. Our tests on data from 222 patients with invasive non-special type breast carcinoma show that automatic binarization of experimental data after autoencoding could outperform other classical preprocessing techniques (such as human-dependent binarization or automatic binarization alone) when applied to the prognosis of breast cancer by immunohistochemical signatures.
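The preprocessing idea described in the abstract (autoencode the marker data, then binarize it automatically) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic data, the eight hypothetical markers scaled to [0, 1], the single sigmoid hidden layer, the learning rate, and the per-marker median threshold used for binarization.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(X, n_hidden=4, lr=0.5, epochs=300, seed=0):
    """Fit a one-hidden-layer sigmoid autoencoder by batch gradient descent.

    Returns a function that maps inputs to their reconstructions.
    The architecture and hyperparameters are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)

    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)   # encode
        R = sigmoid(H @ W2 + b2)   # decode
        # Backpropagate the squared reconstruction error (averaged over samples).
        dR = (R - X) * R * (1.0 - R) / n
        dH = (dR @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ dR)
        b2 -= lr * dR.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)

    def reconstruct(Z):
        return sigmoid(sigmoid(Z @ W1 + b1) @ W2 + b2)

    return reconstruct


def binarize(R):
    """Automatic binarization: 1 where a value exceeds its column median (assumed rule)."""
    return (R > np.median(R, axis=0)).astype(int)


# Synthetic stand-in for marker scores: 222 samples, 8 hypothetical markers in [0, 1].
rng = np.random.default_rng(42)
scores = rng.random((222, 8))

reconstruct = train_autoencoder(scores)
reconstructed = reconstruct(scores)     # autoencoded (smoothed) marker values
signatures = binarize(reconstructed)    # one binary signature per sample
```

Each row of `signatures` is a binary vector that a downstream prognostic model could consume. Applying `binarize` directly to `scores` instead gives the "automatic binarization only" baseline that the abstract compares against.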