Lee Alex J, Cahill Robert, Abbasi-Asl Reza
University of California, San Francisco.
ArXiv. 2023 Mar 29:arXiv:2303.16725v1.
Development and homeostasis in multicellular systems both require exquisite control over spatial molecular pattern formation and maintenance. Advances in spatially-resolved and high-throughput molecular imaging methods such as multiplexed immunofluorescence and spatial transcriptomics (ST) provide exciting new opportunities to augment our fundamental understanding of these processes in health and disease. The large and complex datasets resulting from these techniques, particularly ST, have led to rapid development of innovative machine learning (ML) tools primarily based on deep learning techniques. These ML tools are now increasingly featured in integrated experimental and computational workflows to disentangle signals from noise in complex biological systems. However, the rapidly expanding toolbox of ST analysis methods carries differing implicit assumptions and methodologies that can be difficult to understand and balance. To address this, we summarize the major ST analysis goals that ML can support, along with current analysis trends. We also describe four major data science concepts and related heuristics that can help guide practitioners in choosing the right tools for the right biological questions.