Schuettpelz Eric, Frandsen Paul B, Dikow Rebecca B, Brown Abel, Orli Sylvia, Peters Melinda, Metallo Adam, Funk Vicki A, Dorr Laurence J
National Museum of Natural History, Smithsonian Institution, Washington, DC, United States of America.
Office of the Chief Information Officer, Smithsonian Institution, Washington, DC, United States of America.
Biodivers Data J. 2017 Nov 2;(5):e21139. doi: 10.3897/BDJ.5.e21139. eCollection 2017.
Natural history collections contain data that are critical for many scientific endeavors. Recent efforts in mass digitization are generating large datasets from these collections that can provide unprecedented insight. Here, we present examples of how deep convolutional neural networks can be applied in analyses of imaged herbarium specimens. We first demonstrate that a convolutional neural network can detect mercury-stained specimens across a collection with 90% accuracy. We then show that such a network can correctly distinguish two morphologically similar plant families 96% of the time. Discarding the most challenging specimen images increases accuracy to 94% and 99%, respectively. These results highlight the importance of mass digitization and deep learning approaches and reveal how they can together deliver powerful new investigative tools.
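The abstract does not say how the most challenging specimen images were identified. One common approach, assumed here purely for illustration, is to drop images whose top predicted class probability falls below a cutoff and then measure accuracy on the rest. The sketch below shows that idea; the function name `accuracy_with_rejection`, the `min_confidence` parameter, and the toy probabilities are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def accuracy_with_rejection(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    min_confidence: float,
) -> Tuple[float, float]:
    """Return (accuracy on retained images, fraction of images retained).

    probs[i] holds hypothetical per-class probabilities for image i.
    An image is retained only if its top probability is >= min_confidence.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    retained = 0
    correct = 0
    for p, y in zip(probs, labels):
        # Index of the most probable class for this image.
        top = max(range(len(p)), key=lambda k: p[k])
        if p[top] >= min_confidence:
            retained += 1
            correct += int(top == y)
    if retained == 0:
        # Nothing survived the cutoff, so accuracy is undefined.
        return float("nan"), 0.0
    return correct / retained, retained / len(labels)


# Made-up two-class outputs (not data from the study).
probs = [[0.97, 0.03], [0.55, 0.45], [0.10, 0.90], [0.48, 0.52], [0.80, 0.20]]
labels = [0, 1, 1, 1, 0]

print(accuracy_with_rejection(probs, labels, 0.0))   # keep every image
print(accuracy_with_rejection(probs, labels, 0.75))  # keep confident images only
```

In this toy example, raising the cutoff removes the two borderline images, one of which was misclassified, so accuracy on the retained set rises while coverage falls. That trade-off is the general pattern behind reporting a second, higher accuracy figure after discarding difficult cases.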