Asaf Muhammad Zeeshan, Salam Anum Abdul, Khan Samavia, Musolff Noah, Akram Muhammad Usman, Rao Babar
Department of Computer and Software Engineering, College of Electrical and Mechanical Engineering, National University of Sciences and Technology, Islamabad 44000, Pakistan.
Centre for Dermatology, Rutgers Robert Wood Johnson Medical School, Somerset, NJ 08873, USA.
Data Brief. 2024 Oct 5;57:110997. doi: 10.1016/j.dib.2024.110997. eCollection 2024 Dec.
In the era of artificial intelligence and machine learning, computer-aided diagnostic frameworks are data-hungry and require large amounts of annotated data to automate disease diagnosis. Automation, in turn, is needed to make diagnosis both timely and accurate. We provide a whole slide image repository comprising unstained skin biopsy images acquired using a brightfield microscope, along with chemically and virtually Hematoxylin and Eosin (H&E) stained image samples, to virtualize the staining procedure and improve the efficiency of the disease diagnosis pipeline. The dataset was used to train a Dual Contrastive GAN to generate virtually stained image samples. The trained model achieved a Fréchet Inception Distance (FID) of 80.47 between virtually stained and chemically stained image samples. Because a lower FID means the feature distributions of two image sets are closer, this indicates that the synthesized images closely resemble the chemically stained originals. In contrast, FID scores of 342.01 and 320.40 were observed between unstained images and virtually stained slides, and between unstained images and chemically stained images, respectively, indicating much lower similarity in content.
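For readers unfamiliar with the FID values quoted in the abstract, the following is a minimal sketch of the Fréchet distance between two sets of feature vectors, each modeled as a Gaussian: the squared distance between the means plus Tr(Σa + Σb - 2(ΣaΣb)^(1/2)). It is not the authors' evaluation code. The function name `frechet_distance` is hypothetical, and the random arrays below merely stand in for the Inception-network features that a real FID computation would extract from the stained and unstained images.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features).
    Assumption: in an actual FID computation these would be Inception
    features of the two image sets; here they are arbitrary vectors.
    """
    mu_a = feats_a.mean(axis=0)
    mu_b = feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite covariances. Tiny negative or imaginary
    # parts from round-off are discarded.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b)
                 - 2.0 * trace_sqrt)


# Stand-in features (assumption: synthetic data, not from the dataset).
rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 8))
similar = reference + 0.1    # small shift of the mean
different = reference + 3.0  # large shift of the mean

print(frechet_distance(reference, similar))
print(frechet_distance(reference, different))
```

The second printed value is much larger than the first, mirroring how the abstract reads its scores: a smaller distance means the two sets have more similar feature statistics.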