Ibris, Inc., Monmouth Junction, New Jersey, USA.
BMC Bioinformatics. 2012 Oct 30;13:282. doi: 10.1186/1471-2105-13-282.
Automated classification of histopathology involves identification of multiple classes, including benign, cancerous, and confounder categories. The confounder tissue classes can often mimic and share attributes with both the diseased and normal tissue classes, and can be particularly difficult to identify, both manually and by automated classifiers. In the case of prostate cancer, there may be several confounding tissue types present in a biopsy sample, and these are major sources of diagnostic error for pathologists. Two common multi-class approaches are one-shot classification (OSC), where all classes are identified simultaneously, and one-versus-all (OVA), where a "target" class is distinguished from all "non-target" classes. OSC is typically unable to handle discrimination of classes of varying similarity (e.g., images of prostate atrophy and high-grade cancer), while OVA forces several heterogeneous classes into a single "non-target" class. In this work, we present a cascaded (CAS) approach to classifying prostate biopsy tissue samples, where images from different classes are grouped to maximize intra-group homogeneity while maximizing inter-group heterogeneity.
We apply the CAS approach to categorize 2000 tissue samples taken from 214 patient studies into seven classes: epithelium, stroma, atrophy, prostatic intraepithelial neoplasia (PIN), and prostate cancer Gleason grades 3, 4, and 5. A series of increasingly granular binary classifiers is used to split the different tissue classes until each image has been assigned to a single unique class. Our automatically extracted image feature set includes architectural features based on the location of the nuclei within the tissue sample as well as texture features extracted at the per-pixel level. The CAS strategy yields a positive predictive value (PPV) of 0.86 in classifying the 2000 tissue images into one of seven classes, compared with the OVA (0.77 PPV) and OSC (0.76 PPV) approaches.
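The cascade described above can be pictured as a binary tree in which every internal node is one binary classifier and every leaf is one of the seven tissue classes. The following is a minimal sketch under stated assumptions, not the authors' implementation: the grouping order (cancer versus non-cancer first, then finer splits), the feature names such as `cancer_score`, and the 0.5 thresholds are all hypothetical, chosen only to show how an image is routed through successive binary decisions until it reaches a single class.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

Features = Dict[str, float]


@dataclass
class Node:
    """One binary classifier in the cascade."""
    decide: Callable[[Features], bool]  # True -> follow `yes`, False -> follow `no`
    yes: Union["Node", str]             # sub-cascade or final class label
    no: Union["Node", str]


def above(name: str, threshold: float) -> Callable[[Features], bool]:
    """Hypothetical stand-in for a trained binary classifier:
    thresholds a single made-up feature."""
    return lambda x: x[name] > threshold


def classify(node: Union[Node, str], x: Features) -> str:
    """Walk the cascade until a leaf (a class label) is reached."""
    while isinstance(node, Node):
        node = node.yes if node.decide(x) else node.no
    return node


# Assumed grouping, for illustration only: six binary decisions, seven leaves.
cascade = Node(
    above("cancer_score", 0.5),
    yes=Node(
        above("grade5_score", 0.5),
        yes="Gleason 5",
        no=Node(above("grade4_score", 0.5), yes="Gleason 4", no="Gleason 3"),
    ),
    no=Node(
        above("confounder_score", 0.5),
        yes=Node(above("pin_score", 0.5), yes="PIN", no="atrophy"),
        no=Node(above("stroma_score", 0.5), yes="stroma", no="epithelium"),
    ),
)

sample = {"cancer_score": 0.9, "grade5_score": 0.1, "grade4_score": 0.8}
label = classify(cascade, sample)
```

Each classifier only ever sees the features needed for its own split, so a sample routed down the cancer branch never has to be compared directly against, say, stroma. In a real system each `decide` function would be a trained model operating on the architectural and texture features rather than a fixed threshold.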
Use of the CAS strategy increases the PPV for a multi-category classification system over two common alternative strategies. In classification problems such as histopathology, where multiple class groups exist with varying degrees of heterogeneity, the CAS system can intelligently assign class labels to objects by performing multiple binary classifications according to domain knowledge.
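For readers unfamiliar with the metric, PPV for a class is the fraction of samples predicted as that class that truly belong to it, TP / (TP + FP). The sketch below computes per-class PPV from two label lists. The function name `ppv_per_class` and the toy labels are illustrative assumptions, and the abstract does not state how the single reported PPV figure was aggregated across the seven classes, so no aggregation is shown here.

```python
from collections import Counter
from typing import Dict, Sequence


def ppv_per_class(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Per-class positive predictive value: TP / (TP + FP)."""
    true_positives = Counter()
    predicted = Counter()  # TP + FP for each predicted class
    for truth, guess in zip(y_true, y_pred):
        predicted[guess] += 1
        if truth == guess:
            true_positives[guess] += 1
    return {c: true_positives[c] / predicted[c] for c in predicted}


# Toy labels, not data from the study.
y_true = ["stroma", "stroma", "PIN", "atrophy"]
y_pred = ["stroma", "PIN", "PIN", "atrophy"]
ppv = ppv_per_class(y_true, y_pred)
```

In this toy case two samples are predicted as PIN but only one is truly PIN, so the PPV for PIN is 0.5, while stroma and atrophy each have a PPV of 1.0.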