Li Wanying, Yang Linhe, Peng Guobei, Pang Guangyao, Yu Zhenming, Zhu Xiaoying
Guangxi Colleges and Universities Key Laboratory of Intelligent Software, Wuzhou University, Wuzhou, 543002, China.
Guangxi Key Laboratory of Machine Vision and Intelligent Control, Wuzhou University, Wuzhou, 543002, China.
Sci Rep. 2025 Mar 25;15(1):10247. doi: 10.1038/s41598-025-93954-x.
Chinese medicinal herbs (CMH) comprise a very large number of species, and their microscopic images are difficult to collect, which naturally leads to small sample sizes. In addition, some CMH features are scarce, with the proportion of certain cell types as low as 0.5%. Under these conditions deep learning models fail, and even few-shot learning methods struggle to solve the problem effectively. Expanding the amount of data for rare features is one effective strategy. To address this challenge, we propose a microscopic image augmentation approach for few-shot learning (MIAA-FSL). The approach has two components. First, we design a conditionally guided microscopic image generation model (CGMIGM), which incorporates a conditional guidance technique based on denoising diffusion probabilistic models (DDPM) to efficiently generate rare features and thus alleviate the class imbalance problem. Second, we introduce a semi-supervised learning data augmentation model (SSLDAM), which integrates semi-supervised image processing and pseudo-label generation to overcome damage, blurriness, and poor discernibility in microscopic images, making otherwise unusable images usable. Experimental results show that MIAA-FSL improves identification accuracy by 24% on average compared with the Microscope Image Recognition + DDPM (MIR+DDPM) approach. For rare features in particular, accuracy rises from 45.5% to 87.0%, which effectively mitigates the problem of object detection with few samples.
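The abstract does not specify how SSLDAM generates pseudo-labels. As a rough illustration of the general idea only, the sketch below shows confidence-thresholded pseudo-labeling, a common semi-supervised scheme in which an unlabeled image receives a label only when the classifier's top softmax probability clears a threshold. The function names, the threshold of 0.9, and the example logits are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(batch_logits, threshold=0.9):
    """Keep only confident predictions as pseudo-labels.

    Returns a list of (sample_index, class_id, confidence) tuples.
    The threshold value is an assumption, not a setting from the paper.
    """
    selected = []
    for index, logits in enumerate(batch_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold:
            selected.append((index, probs.index(confidence), confidence))
    return selected


# Hypothetical classifier outputs for three unlabeled image crops.
logits = [
    [4.0, 0.1, 0.2],  # clearly class 0
    [1.0, 1.1, 0.9],  # ambiguous, should be skipped
    [0.0, 0.2, 5.0],  # clearly class 2
]
labels = pseudo_label(logits, threshold=0.9)
for index, class_id, confidence in labels:
    print(f"sample {index}: class {class_id} (confidence {confidence:.3f})")
```

Under this scheme the ambiguous second sample is left unlabeled rather than added to the training set with a possibly wrong label. A lower threshold admits more samples at the cost of noisier labels.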