Buitrago-Duque Carlos, Tobón-Maya Heberley, Gómez-Ramírez Alejandra, Zapata-Valencia Samuel I, Lopera Maria J, Trujillo Carlos, Garcia-Sucerquia Jorge
Appl Opt. 2024 Mar 1;63(7):B49-B58. doi: 10.1364/AO.507412.
Among modern optical microscopy techniques, digital lensless holographic microscopy (DLHM) is one of the simplest label-free coherent imaging approaches. However, the hardware simplicity provided by the lensless configuration is often offset by the demanding computational postprocessing required to match the retrieved sample information to the user's expectations. A promising avenue to simplify this stage is the integration of artificial intelligence and machine learning (ML) solutions into the DLHM workflow. The biggest challenge in doing so is preparing an extensive, high-quality experimental dataset of curated DLHM recordings to train ML models. In this work, a diverse, open-access dataset of DLHM recordings is presented to support future research and contribute to the data needs of the applied research community. The database comprises 11,760 experimental DLHM holograms of biological and non-biological samples, with diversity in the main recording parameters of the DLHM architecture. The database is divided into two datasets of 10 independent imaged samples. The first group, named the multi-wavelength dataset, includes 8,160 holograms and was recorded using laser diodes emitting at 654 nm, 510 nm, and 405 nm; the second group, named the single-wavelength dataset, is composed of 3,600 recordings and was acquired using a 633 nm He-Ne laser. All the experimental parameters related to the dataset acquisition, preparation, and calibration are described in this paper. The advantages of this large dataset are validated by re-training an existing autofocusing model for DLHM and by using it as the training set for a simpler architecture that achieves comparable performance, demonstrating its suitability for improving existing ML-based models and developing new ones.