Chan Zuckerberg Biohub, San Francisco, California, United States of America.
PLoS Comput Biol. 2021 Aug 9;17(8):e1009274. doi: 10.1371/journal.pcbi.1009274. eCollection 2021 Aug.
Recent advancements in in situ methods, such as multiplexed in situ RNA hybridization and in situ RNA sequencing, have deepened our understanding of the way biological processes are spatially organized in tissues. Automated image processing and spot-calling algorithms for analyzing in situ transcriptomics images have many parameters which need to be tuned for optimal detection. Having ground truth datasets (images where there is very high confidence on the accuracy of the detected spots) is essential for evaluating these algorithms and tuning their parameters. We present a first-of-its-kind open-source toolkit and framework for in situ transcriptomics image analysis that incorporates crowdsourced annotations, alongside expert annotations, as a source of ground truth for the analysis of in situ transcriptomics images. The kit includes tools for preparing images for crowdsourced annotation to optimize crowd workers' ability to annotate these images reliably, performing quality control (QC) on worker annotations, extracting candidate parameters for spot-calling algorithms from sample images, tuning parameters for spot-calling algorithms, and evaluating spot-calling algorithms and worker performance. These tools are wrapped in a modular pipeline with a flexible structure that allows users to take advantage of crowdsourced annotations from any source of their choice. We tested the pipeline using real and synthetic in situ transcriptomics images, with annotations obtained from Amazon Mechanical Turk workers via Quanti.us. Using real images from in situ experiments and simulated images produced by one of the tools in the kit, we studied worker sensitivity to spot characteristics and established rules for annotation QC. We explored and demonstrated the use of ground truth generated in this way for validating spot-calling algorithms and tuning their parameters, and confirmed that consensus crowdsourced annotations are a viable substitute for expert-generated ground truth for these purposes.
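To make the workflow described in the abstract concrete, the sketch below illustrates (in Python) one way the two central steps could look: aggregating click annotations from multiple crowd workers into consensus "ground truth" spots, and scoring a spot-calling algorithm's detections against that consensus. This is a minimal illustration, not the authors' implementation; the function names (consensus_spots, score_detections), the pixel radii, and the minimum-worker threshold are all assumptions chosen for the example.

```python
# Hypothetical sketch of consensus aggregation and spot-caller evaluation.
# Not the published toolkit's API; all names and thresholds are illustrative.
import numpy as np
from scipy.spatial import cKDTree


def consensus_spots(worker_clicks, radius=3.0, min_workers=3):
    """Greedily cluster (x, y) clicks pooled across workers; keep clusters
    supported by at least `min_workers` distinct workers."""
    points = np.array([p for clicks in worker_clicks for p in clicks], dtype=float)
    worker_ids = np.array([w for w, clicks in enumerate(worker_clicks) for _ in clicks])
    unused = np.ones(len(points), dtype=bool)
    tree = cKDTree(points)
    consensus = []
    for i in np.argsort(points[:, 0]):  # deterministic seed order
        if not unused[i]:
            continue
        members = [j for j in tree.query_ball_point(points[i], r=radius) if unused[j]]
        if len(set(worker_ids[members])) >= min_workers:
            consensus.append(points[members].mean(axis=0))
        unused[members] = False
    return np.array(consensus)


def score_detections(detected, truth, match_radius=3.0):
    """Match detections to consensus spots within `match_radius` pixels
    and return (precision, recall, F1)."""
    if len(detected) == 0 or len(truth) == 0:
        return 0.0, 0.0, 0.0
    tree = cKDTree(truth)
    dists, idx = tree.query(detected, distance_upper_bound=match_radius)
    matched_truth = {j for d, j in zip(dists, idx) if np.isfinite(d)}
    tp = len(matched_truth)
    precision = tp / len(detected)
    recall = tp / len(truth)
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_spots = rng.uniform(0, 100, size=(20, 2))
    # Simulate 5 workers, each clicking near every true spot with jitter.
    workers = [true_spots + rng.normal(scale=1.0, size=true_spots.shape) for _ in range(5)]
    truth = consensus_spots(workers, radius=3.0, min_workers=3)
    detections = true_spots + rng.normal(scale=0.5, size=true_spots.shape)
    print(score_detections(detections, truth))
```

In this sketch, sweeping a spot-caller's parameters and choosing the setting that maximizes F1 against the consensus spots is the kind of parameter-tuning loop the abstract refers to; the published pipeline's actual QC and matching rules may differ.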