Photonics Group, Physics Department, Imperial College London, London, U.K.
Northwest Institute of Nuclear Technology, Xi'an, Shaanxi, P.R. China.
J Microsc. 2019 Feb;273(2):148-160. doi: 10.1111/jmi.12772. Epub 2018 Dec 3.
Super-resolved microscopy techniques have revolutionized the ability to study biological structures below the diffraction limit. Single molecule localization microscopy (SMLM) techniques are widely used because they are relatively straightforward to implement and can be realized at relatively low cost, e.g. compared to laser scanning microscopy techniques. However, while the data analysis can be readily undertaken using open source or other software tools, large SMLM data volumes and the complexity of the algorithms used often lead to long image data processing times that can hinder the iterative optimization of experiments. There is increasing interest in high throughput SMLM, but its further development and application are inhibited by the data processing challenges. We present here a widely applicable approach to accelerating SMLM data processing via a parallelized implementation of ThunderSTORM on a high-performance computing (HPC) cluster and quantify the speed advantage for a four-node cluster (with 24 cores and 128 GB RAM per node) compared to a high specification (28 cores, 128 GB RAM, SSD-enabled) desktop workstation. This data processing speed can be readily scaled by accessing more HPC resources. Our approach is not specific to ThunderSTORM and can be adapted for a wide range of SMLM software.
LAY DESCRIPTION: Optical microscopy is now able to provide images with a resolution far beyond the diffraction limit thanks to relatively new super-resolved microscopy (SRM) techniques, which have revolutionized the ability to study biological structures. One approach to SRM is to randomly switch on and off the emission of fluorescent molecules in an otherwise conventional fluorescence microscope. If only a sparse subset of the fluorescent molecules labelling a sample is switched on at a time, then the active emitters will be, on average, spaced further apart than the diffraction-limited resolution of the conventional microscope, and the separate bright spot in the image corresponding to each emitter can be localised to high precision by finding the centre of each feature using a computer program. Thus, a precise map of the emitter positions can be built up by sequentially localising different subsets of emitters as some are switched on and others are switched off. Typically, this approach, described as single molecule localisation microscopy (SMLM), results in large image data sets that can take many minutes to hours to process, depending on the size of the field of view and whether the SMLM analysis employs a computationally intensive iterative algorithm. Such a slow workflow makes it difficult to optimise experiments and to analyse large numbers of samples. Faster SMLM experiments would be generally useful, and automated high throughput SMLM studies of arrays of samples, such as cells, could be applied to drug discovery and other applications. However, the time required to process the resulting data would be prohibitive on a normal computer. To address this, we have developed a method to run standard SMLM data analysis software tools in parallel on a high-performance computing (HPC) cluster. This can be used to accelerate the analysis of individual SMLM experiments, or it can be scaled to analyse high throughput SMLM data by extending it to run on an arbitrary number of HPC processors in parallel. In this paper we outline the design of our parallelised SMLM software for HPC and quantify the speed advantage when implementing it on four HPC nodes compared to a powerful desktop computer.
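The general idea described above, localising emitters frame by frame and spreading the frames over several workers, can be illustrated with a minimal sketch. This is not the authors' implementation and does not call ThunderSTORM: the function names (`split_frames`, `centroid`, `localise_chunk`, `localise_stack`) are hypothetical, a simple intensity-weighted centroid stands in for the fitting algorithms a real SMLM package would use, each frame is assumed to contain a single emitter, and a local thread pool stands in for HPC nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_frames(n_frames, n_workers):
    """Return contiguous (start, stop) frame ranges, at most one per worker."""
    base, extra = divmod(n_frames, n_workers)
    ranges, start = [], 0
    for i in range(n_workers):
        stop = start + base + (1 if i < extra else 0)
        if stop > start:  # skip empty ranges when workers outnumber frames
            ranges.append((start, stop))
        start = stop
    return ranges


def centroid(frame):
    """Intensity-weighted centre (x, y) of a frame given as a list of rows.

    Assumption: the frame holds one emitter and has non-zero total intensity.
    """
    total = sum(sum(row) for row in frame)
    x = sum(v * j for row in frame for j, v in enumerate(row)) / total
    y = sum(v * i for i, row in enumerate(frame) for v in row) / total
    return x, y


def localise_chunk(frames, start, stop):
    """Localise frames[start:stop]; returns (frame_index, x, y) tuples."""
    return [(i, *centroid(frames[i])) for i in range(start, stop)]


def localise_stack(frames, n_workers=4):
    """Split the stack, localise each chunk in a worker, merge in frame order."""
    ranges = split_frames(len(frames), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map preserves the order of `ranges`, so the merge stays sorted.
        parts = pool.map(lambda r: localise_chunk(frames, *r), ranges)
        return [loc for part in parts for loc in part]
```

Because each frame is analysed independently, the chunks share no state and the merge step is a simple concatenation. On a cluster, each `(start, stop)` range would presumably be handed to a separate job or node rather than to a thread, which is why adding more workers can shorten the wall-clock time.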