Center for Biosystems and Biotech Data Science, Ghent University Global Campus, Incheon, 21985, South Korea.
IDLab, Ghent University, Technologiepark-Zwijnaarde 126, B-9052, Ghent, Belgium.
Sci Data. 2023 Oct 18;10(1):716. doi: 10.1038/s41597-023-02608-y.
Trypanosomiasis, a neglected tropical disease (NTD), challenges communities in sub-Saharan Africa and Latin America. The World Health Organization underscores the need for practical, field-adaptable diagnostics and rapid screening tools to address the negative impact of NTDs. While artificial intelligence has shown promising results in disease screening, the lack of curated datasets impedes progress. In response to this challenge, we developed the Tryp dataset, comprising microscopy images of unstained thick blood smears containing the Trypanosoma brucei brucei parasite. The dataset provides tight bounding box annotations around the parasite for 3,085 positive images, along with 93 images collected from negative blood samples. It is the largest dataset of its kind. Furthermore, we provide a benchmark of three leading deep learning-based object detection techniques, which demonstrates the feasibility of AI for this task. Overall, the availability of the Tryp dataset is expected to facilitate research on diagnostic screening for this disease, which may lead to improved healthcare outcomes for the affected communities.