Wührl Lorenz, Pylatiuk Christian, Giersch Matthias, Lapp Florian, von Rintelen Thomas, Balke Michael, Schmidt Stefan, Cerretti Pierfilippo, Meier Rudolf
Institute for Automation and Applied Informatics (IAI), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany.
Museum für Naturkunde, Center for Integrative Biodiversity Discovery, Leibniz-Institut für Evolutions- und Biodiversitätsforschung, Berlin, Germany.
Mol Ecol Resour. 2022 May;22(4):1626-1638. doi: 10.1111/1755-0998.13567. Epub 2021 Dec 23.
Invertebrate biodiversity remains poorly understood even though invertebrates make up much of the terrestrial animal biomass, comprise most species, and supply many ecosystem services. The main obstacle is the processing of specimen-rich samples obtained with quantitative sampling techniques (e.g., Malaise trapping). Traditional sorting requires manual handling, while molecular techniques based on metabarcoding lose the association between individual specimens and sequences and therefore struggle to deliver precise abundance information. Here we present a sorting robot, the DiversityScanner, that prepares specimens from bulk samples for barcoding. It detects, images and measures individual specimens from a sample and then moves them into the wells of a 96-well microplate. We show that the images can be used to train convolutional neural networks (CNNs) capable of assigning the specimens to 14 insect taxa (usually families) that are particularly common in Malaise trap samples. The average assignment precision across all taxa is 91.4% (75%-100%). This ability to identify common taxa allows for taxon-specific subsampling, because the robot can be instructed to pick only a prespecified number of specimens for abundant taxa. To obtain biomass information, the images are also used to measure specimen length and estimate body volume. We outline how the DiversityScanner can be a key component for tackling and monitoring invertebrate diversity by combining molecular and morphological tools: the images generated by the robot become training images for machine learning once they are labelled with taxonomic information from DNA barcodes. We suggest that a combination of automation, machine learning and DNA barcoding has the potential to tackle invertebrate diversity at an unprecedented scale.
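The taxon-specific subsampling described in the abstract can be illustrated with a minimal sketch. It is not the robot's actual control software: the function names (`well_name`, `subsample`), the per-taxon cap dictionary, the row-by-row plate filling order, and the example family names are all assumptions made for illustration. The sketch assumes each specimen already has a predicted taxon label from the CNN and that the robot skips specimens of a taxon once that taxon's cap is reached.

```python
from collections import Counter
from string import ascii_uppercase

ROWS, COLS = 8, 12  # standard 96-well microplate layout (8 rows x 12 columns)


def well_name(index):
    """Map a 0-based position (0..95) to a well label, filling row by row."""
    if not 0 <= index < ROWS * COLS:
        raise ValueError("well index out of range")
    row, col = divmod(index, COLS)
    return f"{ascii_uppercase[row]}{col + 1}"


def subsample(predicted_taxa, caps, default_cap=None):
    """Select specimens for one plate, limiting abundant taxa.

    predicted_taxa: predicted taxon label per specimen, in detection order.
    caps: hypothetical mapping taxon -> maximum number of specimens to pick.
    default_cap: cap for taxa not listed in caps (None means no limit).
    Returns a list of (specimen_index, taxon, well) tuples.
    """
    picked = Counter()
    plate = []
    for i, taxon in enumerate(predicted_taxa):
        cap = caps.get(taxon, default_cap)
        if cap is not None and picked[taxon] >= cap:
            continue  # taxon already sufficiently represented; skip specimen
        if len(plate) == ROWS * COLS:
            break  # plate is full
        picked[taxon] += 1
        plate.append((i, taxon, well_name(len(plate))))
    return plate


# Illustrative input: family names are examples, not results from the paper.
predictions = ["Phoridae", "Phoridae", "Cecidomyiidae", "Phoridae", "Sciaridae"]
selection = subsample(predictions, caps={"Phoridae": 2})
for specimen, taxon, well in selection:
    print(specimen, taxon, well)
```

With a cap of two for the first family, the third specimen of that family is skipped while rarer taxa are still plated, which is the behaviour the abstract attributes to subsampling abundant taxa.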