Machine learning training data: over 500,000 images of butterflies and moths (Lepidoptera) with species labels.

Authors

Barkmann Friederike, Lindner Andreas, Würflinger Ronald, Höttinger Helmut, Rüdisser Johannes

Affiliations

Department of Ecology, University of Innsbruck, Innsbruck, Austria.

Advanced Computing Austria ACA GmbH, Wien, Austria.

Publication information

Sci Data. 2025 Aug 6;12(1):1369. doi: 10.1038/s41597-025-05708-z.

DOI:10.1038/s41597-025-05708-z
PMID:40770239
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12329013/
Abstract

Deep learning models can accelerate the processing of image-based biodiversity data and provide educational value by giving direct feedback to citizen scientists. However, training such models requires large amounts of labelled data, and not all species are equally suited to identification from images alone. Most butterfly and many moth species (Lepidoptera), which play an important role as biodiversity indicators, are well suited to such approaches. This dataset contains over 540,000 images of 185 butterfly and moth species that occur in Austria. Images were collected by citizen scientists with the application "Schmetterlinge Österreichs", and correct species identification was ensured by an experienced entomologist. The number of images per species ranges from one to nearly 30,000. Such strong class imbalance is common in datasets of species records. The dataset is larger than other published datasets of butterfly and moth images and offers opportunities for training and evaluating machine learning models on the fine-grained classification task of species identification.
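The class imbalance described in the abstract (from one image to nearly 30,000 per species) is often handled by weighting classes inversely to their frequency during training. The following is a minimal sketch of that idea, not the authors' method: the function name `balanced_class_weights` and the toy label list are hypothetical, and the formula `n_samples / (n_classes * count)` is the common "balanced" heuristic rather than anything specified in the paper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Map each class to n_samples / (n_classes * class_count).

    Rare classes get weights above 1, frequent classes below 1, so every
    class contributes equally to a weighted loss in aggregate.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        species: n_samples / (n_classes * count)
        for species, count in counts.items()
    }


# Hypothetical toy labels for illustration only; not taken from the dataset.
labels = (
    ["Pieris rapae"] * 6
    + ["Vanessa atalanta"] * 3
    + ["Parnassius apollo"] * 1
)

weights = balanced_class_weights(labels)
for species, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{species}: {weight:.3f}")
```

Weights of this kind could be passed to a weighted loss function in a deep learning framework. Whether reweighting, resampling, or neither works best for this dataset would need to be evaluated empirically.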

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d988/12329013/7b0b72614e69/41597_2025_5708_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d988/12329013/74477c93ddf1/41597_2025_5708_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d988/12329013/8a6abfa8dec5/41597_2025_5708_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d988/12329013/f4cf70723710/41597_2025_5708_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d988/12329013/97fdc0e95088/41597_2025_5708_Fig5_HTML.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d988/12329013/9510649bc62d/41597_2025_5708_Fig6_HTML.jpg
