A fusocelular skin dataset with whole slide images for deep learning models.

Authors

Del Amor Rocío, López-Pérez Miguel, Meseguer Pablo, Morales Sandra, Terradez Liria, Aneiros-Fernandez Jose, Mateos Javier, Molina Rafael, Naranjo Valery

Affiliations

Instituto Universitario de Investigación en Tecnología Centrada en el Ser Humano, HUMAN-tech Universitat Politècnica de València, Valencia, Spain.

Artikode Intelligence S.L, Valencia, Spain.

Publication information

Sci Data. 2025 May 14;12(1):788. doi: 10.1038/s41597-025-05108-3.

Abstract

Cutaneous spindle cell (CSC) lesions encompass a spectrum from benign to malignant neoplasms, often posing significant diagnostic challenges. Computer-aided diagnosis systems offer a promising way to make pathologists' decisions faster and more objective. These systems usually require large-scale datasets with curated labels for effective training; however, manual annotation is time-consuming and expensive. To overcome this challenge, crowdsourcing has emerged as a popular and valuable strategy to scale up the labeling process by distributing the effort among different non-expert annotators. This work introduces AI4SkIN, the first public dataset of Whole Slide Images (WSIs) for CSC neoplasms, annotated using an innovative crowdsourcing protocol. The AI4SkIN dataset contains 641 Hematoxylin and Eosin-stained WSIs with multiclass labels from both expert and trainee pathologists. The dataset supports improved CSC neoplasm diagnosis using advanced machine learning and crowdsourcing methods based on Gaussian Processes, and experiments show that models trained on non-expert labels perform comparably to those trained on expert labels. In conclusion, we illustrate that AI4SkIN provides a good resource for developing and validating methods for multiclass CSC neoplasm classification.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4012/12078617/952c7cad98c9/41597_2025_5108_Fig1_HTML.jpg
