Del Amor Rocío, López-Pérez Miguel, Meseguer Pablo, Morales Sandra, Terradez Liria, Aneiros-Fernandez Jose, Mateos Javier, Molina Rafael, Naranjo Valery
Instituto Universitario de Investigación en Tecnología Centrada en el Ser Humano, HUMAN-tech, Universitat Politècnica de València, Valencia, Spain.
Artikode Intelligence S.L., Valencia, Spain.
Sci Data. 2025 May 14;12(1):788. doi: 10.1038/s41597-025-05108-3.
Cutaneous spindle cell (CSC) lesions encompass a spectrum from benign to malignant neoplasms, often posing significant diagnostic challenges. Computer-aided diagnosis systems offer a promising way to make pathologists' decisions faster and more objective. These systems usually require large-scale datasets with curated labels for effective training; however, manual annotation is time-consuming and expensive. To overcome this challenge, crowdsourcing has emerged as a popular and valuable strategy to scale up the labeling process by distributing the effort among multiple non-expert annotators. This work introduces AI4SkIN, the first public dataset of Whole Slide Images (WSIs) for CSC neoplasms, annotated using an innovative crowdsourcing protocol. The AI4SkIN dataset contains 641 hematoxylin and eosin-stained WSIs with multiclass labels from both expert and trainee pathologists. Experiments on the dataset with advanced machine learning and crowdsourcing methods based on Gaussian Processes show that models trained on non-expert labels perform comparably to those trained on expert labels. In conclusion, we show that AI4SkIN is a useful resource for developing and validating methods for multiclass CSC neoplasm classification.