
Leveraging synthetic data produced from museum specimens to train adaptable species classification models.

Authors

Blair Jarrett D, Khidas Kamal, Marshall Katie E

Affiliations

Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada.

Department of Ecoscience, Aarhus University, Aarhus, Denmark.

Publication

PLoS One. 2025 Sep 3;20(9):e0329482. doi: 10.1371/journal.pone.0329482. eCollection 2025.

Abstract

Computer vision has increasingly shown potential to improve data processing efficiency in ecological research. However, training computer vision models requires large amounts of high-quality, annotated training data. This poses a significant challenge for researchers looking to create bespoke computer vision models, as substantial human resources and biological replicates are often needed to adequately train these models. Synthetic images have been proposed as a potential solution for generating large training datasets, but models trained with synthetic images often have poor generalization to real photographs. Here we present a modular pipeline for training generalizable classification models using synthetic images. Our pipeline includes 3D asset creation using 3D scanners, synthetic image generation with open-source computer graphics software, and domain-adaptive classification model training. We demonstrate our pipeline by applying it to skulls of 16 mammal species in the order Carnivora. We explore several domain adaptation techniques, including maximum mean discrepancy (MMD) loss, fine-tuning, and data supplementation. Using our pipeline, we were able to improve classification accuracy on real photographs from 55.4% to a maximum of 95.1%. We also conducted qualitative analysis with t-distributed stochastic neighbor embedding (t-SNE) and gradient-weighted class activation mapping (Grad-CAM) to compare different domain adaptation techniques. Our results demonstrate the feasibility of using synthetic images for ecological computer vision and highlight the potential of museum specimens and 3D assets for scalable, generalizable model training.
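The abstract names maximum mean discrepancy (MMD) loss as one of the domain adaptation techniques explored. As a rough illustration only, the sketch below shows how an MMD penalty is commonly implemented in PyTorch to pull the feature distributions of synthetic renders and real photographs toward each other; the RBF kernel, the bandwidth values, the helper names (gaussian_kernel, mmd_loss), and the weighting term lam are assumptions for illustration, not implementation details taken from the paper.

```python
# Minimal sketch of an MMD penalty for synthetic-to-real domain adaptation.
# Kernel choice (multi-bandwidth RBF) and all names are assumptions; the
# paper's actual implementation is not specified in this abstract.
import torch

def gaussian_kernel(x, y, bandwidths=(1.0, 5.0, 10.0)):
    """Multi-bandwidth RBF kernel matrix between two batches of feature vectors."""
    dists = torch.cdist(x, y) ** 2  # squared pairwise Euclidean distances
    return sum(torch.exp(-dists / (2 * b ** 2)) for b in bandwidths)

def mmd_loss(source_feats, target_feats):
    """Biased estimate of squared MMD between source (synthetic) and target (real) features."""
    k_ss = gaussian_kernel(source_feats, source_feats).mean()
    k_tt = gaussian_kernel(target_feats, target_feats).mean()
    k_st = gaussian_kernel(source_feats, target_feats).mean()
    return k_ss + k_tt - 2 * k_st

# During training, a term like this would be added to the classification loss so
# the network's penultimate features align across domains (lam is hypothetical):
# total_loss = cross_entropy(logits_synth, labels) + lam * mmd_loss(f_synth, f_real)
```

Minimizing this term alongside the usual classification loss discourages the network from learning features that separate synthetic images from real photographs, which is the failure mode behind the poor generalization the abstract describes.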


Fig 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47a4/12407421/35d768b01e3f/pone.0329482.g001.jpg
