Center for Bio/Molecular Science and Engineering Code 6900, U.S. Naval Research Laboratory, Washington, DC, 20375, USA.
Department of Bioengineering, Volgenau School of Engineering, George Mason University, Fairfax, VA, 22030, USA.
Sci Rep. 2022 Mar 9;12(1):3871. doi: 10.1038/s41598-022-07759-3.
The intra-image identification of DNA structures is essential to rapid prototyping and quality control of self-assembled DNA origami scaffold systems. We postulate that YOLO, a modern object detection platform commonly used for facial recognition, can be applied to rapidly scan atomic force microscope (AFM) images and identify correctly formed DNA nanostructures with high fidelity. To make this approach widely available, we use open-source software and provide a straightforward procedure for designing a tailored, intelligent identification platform that can easily be repurposed to fit arbitrary structural geometries beyond AFM images of DNA structures. Here, we describe methods to acquire and generate the components needed to create this robust system. Beginning with DNA structure design, we detail AFM imaging, data point annotation, data augmentation, model training, and inference. To demonstrate the adaptability of this system, we assembled two distinct DNA origami architectures (triangles and breadboards) for detection in raw AFM images. Using the images acquired of each structure, we trained two separate single-class object identification models, one for each architecture. By applying these models in sequence, we correctly identified 3470 structures from a total population of 3617, using images that sometimes included a third DNA origami structure as well as other impurities. Analysis was completed in under 20 s and yielded an F1 score of 0.96.
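As a rough illustration of how the reported counts relate to the F1 score, the sketch below computes precision, recall, and F1 from detection counts. It assumes that the 3470 correctly identified structures are true positives and that the remaining structures in the population of 3617 were missed (false negatives). The abstract does not state the number of false positives, so `HYPOTHETICAL_FALSE_POSITIVES` is a placeholder to be replaced with a real count; the function name `detection_scores` is likewise illustrative and not taken from the paper.

```python
def detection_scores(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from detection counts.

    tp: correctly detected structures (true positives)
    fp: detections that match no real structure (false positives)
    fn: real structures that were not detected (false negatives)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall,
    # which simplifies to 2*TP / (2*TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return precision, recall, f1


# Counts taken from the abstract.
TOTAL_STRUCTURES = 3617
CORRECTLY_IDENTIFIED = 3470

# Assumption: every structure that was not correctly identified was missed.
missed = TOTAL_STRUCTURES - CORRECTLY_IDENTIFIED

# Assumption: placeholder only; the abstract does not report this number.
HYPOTHETICAL_FALSE_POSITIVES = 0

precision, recall, f1 = detection_scores(
    CORRECTLY_IDENTIFIED, HYPOTHETICAL_FALSE_POSITIVES, missed
)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

With the placeholder of zero false positives the result is an upper bound on F1 for these counts; any nonzero false-positive count lowers precision and therefore F1.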