School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China.
Anal Cell Pathol (Amst). 2022 Feb 27;2022:4376178. doi: 10.1155/2022/4376178. eCollection 2022.
Currently, the ThinPrep cytologic test (TCT) is the most widely used cytology test for cervical cancer. It can detect precancerous conditions and microbial infections. However, the technique relies entirely on manual operation and on doctors' naked-eye observation, which results in a heavy workload and low accuracy. Automatic pathological diagnosis has recently been developed to address this problem, and cervical cell classification is a key technology in an intelligent cervical cancer diagnosis system. Training a classification model based on deep neural networks requires a large amount of data, but labeling cervical cells requires specialized physicians and is costly, so labeled data in this field are scarce. To address this problem, we propose a method that maintains high accuracy in cervical cell classification with a small amount of labeled data by introducing hand-crafted features and a voting mechanism to achieve data expansion in semi-supervised learning. The method consists of three main steps: using a clarity function to filter out high-quality cervical cell images, annotating a small number of them, and balancing the training data using a voting mechanism. With a small amount of labeled data, the proposed method reaches an accuracy of 91.94%.
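The filtering and voting steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the clarity function or the voting rule, so everything here is an assumption for illustration: the variance of a 4-neighbour Laplacian stands in for the clarity function, a simple majority vote stands in for the voting mechanism, and the names `laplacian_variance`, `filter_sharp`, `vote`, `threshold`, and `min_agree` are hypothetical.

```python
from collections import Counter

import numpy as np


def laplacian_variance(img):
    """Sharpness score: variance of a 4-neighbour Laplacian.

    Assumed stand-in for the paper's clarity function, which the
    abstract does not define.
    """
    img = np.asarray(img, dtype=np.float64)
    lap = (img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:]
           - 4.0 * img[1:-1, 1:-1])
    return float(lap.var())


def filter_sharp(images, threshold):
    """Keep grayscale images whose score meets a hypothetical threshold."""
    return [im for im in images if laplacian_variance(im) >= threshold]


def vote(predictions, min_agree):
    """Return the majority label if at least min_agree voters agree.

    Returns None otherwise, so the sample stays unlabeled. The voters
    could be, for example, classifiers built on different features
    (an assumption, not the paper's stated design).
    """
    label, count = Counter(predictions).most_common(1)[0]
    return label if count >= min_agree else None
```

In a semi-supervised loop of this kind, unlabeled images that pass `filter_sharp` and receive a non-`None` label from `vote` would be added to the training set, with fewer additions drawn for classes that are already well represented.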