Ong Song-Quan, Mat Jalaluddin Nurzatil Sharleeza, Yong Kien Thai, Ong Su Ping, Lim Kooi Fong, Azhar Suhaila
Institute for Tropical Biology and Conservation (ITBC), Universiti Malaysia Sabah, Kota Kinabalu, Malaysia.
Centre for Research in Biotechnology for Agriculture (CEBAR), Universiti Malaya, Kuala Lumpur, Malaysia.
Ecol Evol. 2023 Jun 14;13(6):e10212. doi: 10.1002/ece3.10212. eCollection 2023 Jun.
Natural history museum collections are the most important sources of information on the present and past biodiversity of our planet. Most of this information is stored in analogue form, and digitizing the collections can provide open access to the images and specimen data needed to address many global challenges. However, many museums do not digitize their collections because of constraints on budgets, human resources, and technologies. To encourage digitization, we present a guideline that offers low-cost solutions and the technical knowledge needed to apply them, while balancing the quality of the work and its outcomes. The guideline describes three phases of digitization: preproduction, production, and postproduction. The preproduction phase includes human resource planning and selecting the highest-priority collections for digitization. Also in this phase, a worksheet is provided for the digitizer to document the metadata, together with a list of the equipment needed to set up a digitizer station for imaging the specimens and their associated labels. In the production phase, we place special emphasis on light and color calibration, and on guidelines for ISO, shutter speed, and aperture, to ensure satisfactory quality of the digitized output. Once the specimens and labels have been imaged in the production phase, we demonstrate an end-to-end pipeline that uses optical character recognition (OCR) to convert the physical text on the labels into digital form and record it in a worksheet cell. A nationwide capacity-building workshop was then conducted to impart the guideline, and pre- and postcourse surveys were conducted to assess the confidence and skills acquired by the participants. This paper also discusses the challenges and the future work needed for proper digital biodiversity data management.
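The label-to-worksheet step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract names neither the OCR engine nor the worksheet layout, so the `ocr` callable, the column names in `FIELDS`, and the `fake_ocr` stand-in are all assumptions made for illustration only.

```python
import csv
import io
from typing import Callable, Iterable

# Hypothetical worksheet columns; the abstract does not specify the real ones.
FIELDS = ["image_file", "label_text"]


def normalise_label_text(raw: str) -> str:
    """Collapse the line breaks and repeated spaces typical of OCR output."""
    return " ".join(raw.split())


def labels_to_worksheet(
    image_files: Iterable[str],
    ocr: Callable[[str], str],
    out,
) -> int:
    """Run `ocr` on each label image and write one worksheet row per image.

    `ocr` is any function that maps an image path to recognised text
    (an assumption; plug in whichever OCR engine is actually used).
    `out` is a writable text file object. Returns the number of rows written.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for path in image_files:
        text = normalise_label_text(ocr(path))
        writer.writerow({"image_file": path, "label_text": text})
        count += 1
    return count


def fake_ocr(path: str) -> str:
    """Stand-in for a real OCR engine so the sketch runs without extra dependencies."""
    return "Example label\n  line two  "


if __name__ == "__main__":
    buffer = io.StringIO()
    labels_to_worksheet(["label_001.jpg"], fake_ocr, buffer)
    print(buffer.getvalue())
```

Keeping the OCR engine behind a plain callable means the worksheet logic can be checked without any imaging software installed. One possible real engine is Tesseract via the `pytesseract` package, for example `ocr=lambda p: pytesseract.image_to_string(Image.open(p))` with `Image` imported from Pillow; this is a suggestion, not something the abstract states.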