Bioimage Classification with Handcrafted and Learned Features

Author information

Nanni Loris, Brahnam Sheryl, Ghidoni Stefano, Lumini Alessandra

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2018 Mar 30. doi: 10.1109/TCBB.2018.2821127.

Abstract

Bioimage classification is becoming increasingly important in many biological studies, including those that require accurate cell phenotype recognition, subcellular localization, and histopathological classification. In this paper, we present a new General Purpose (GenP) bioimage classification method that can be applied to a wide range of classification problems. The GenP system we propose is an ensemble that combines multiple texture features (both handcrafted and learned descriptors) for superior and generalizable discriminative power. The ensemble boosts performance by combining local features, dense sampling features, and deep learning features. Each descriptor is used to train a separate Support Vector Machine, and the resulting classifiers are combined by the sum rule. We evaluate our method on a diverse set of bioimage classification tasks, each represented by a benchmark database, including some of those available in the IICBU 2008 database. Each task represents a typical subcellular, cellular, or tissue-level classification problem. Our evaluation on these datasets demonstrates that the proposed GenP bioimage ensemble obtains state-of-the-art performance without any ad hoc, dataset-specific tuning of the parameters (thereby avoiding the risk of overfitting/overtraining). To reproduce the experiments reported in this paper, the MATLAB code of all the descriptors is available at https://github.com/LorisNanni and https://www.dropbox.com/s/bguw035yrqz0pwp/ElencoCode.docx?dl=0.
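
The sum-rule fusion mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation. The score values, the function names `zscore` and `sum_rule`, and the choice to standardize each classifier's scores before summing are all assumptions made for illustration; the abstract only states that one Support Vector Machine is trained per descriptor and that their outputs are combined by the sum rule.

```python
from statistics import mean, pstdev


def zscore(scores):
    """Standardize one classifier's scores to zero mean and unit variance.

    Assumption: normalizing before fusion keeps any single classifier
    from dominating the sum; the abstract does not specify this step.
    """
    flat = [s for row in scores for s in row]
    mu, sigma = mean(flat), pstdev(flat)
    if sigma == 0:
        return [[0.0 for _ in row] for row in scores]
    return [[(s - mu) / sigma for s in row] for row in scores]


def sum_rule(score_sets):
    """Fuse per-classifier score matrices by the sum rule.

    score_sets: one matrix per classifier, each shaped
    [n_samples][n_classes]. Returns the predicted class index per sample.
    """
    normalized = [zscore(s) for s in score_sets]
    n_samples = len(normalized[0])
    n_classes = len(normalized[0][0])
    predictions = []
    for i in range(n_samples):
        # Sum the scores that every classifier assigns to each class.
        fused = [sum(m[i][c] for m in normalized) for c in range(n_classes)]
        # Pick the class with the largest fused score.
        predictions.append(max(range(n_classes), key=fused.__getitem__))
    return predictions


# Hypothetical decision scores for two samples and two classes, standing in
# for SVMs trained on a local, a dense-sampling, and a deep descriptor.
local_scores = [[0.9, 0.1], [0.4, 0.6]]
dense_scores = [[0.2, 0.8], [0.3, 0.7]]
deep_scores = [[0.8, 0.2], [0.1, 0.9]]

predictions = sum_rule([local_scores, dense_scores, deep_scores])
print(predictions)
```

In this toy input the dense-sampling classifier disagrees with the other two on the first sample, and the fused score still favors the class that the other two support. That is the kind of complementarity an ensemble of heterogeneous descriptors is meant to exploit.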
