Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Gene Center and Department of Biochemistry, Ludwig-Maximilians-University Munich, Munich, Germany.
Nat Biotechnol. 2023 Oct;41(10):1465-1473. doi: 10.1038/s41587-023-01657-3. Epub 2023 Feb 16.
Transferring annotations of single-cell-, spatial- and multi-omics data is often challenging owing both to technical limitations, such as low spatial resolution or high dropout fraction, and to biological variations, such as continuous spectra of cell states. Based on the concept that these data are often best described as continuous mixtures of cells or molecules, we present a computational framework for the transfer of annotations to cells and their combinations (TACCO), which consists of an optimal transport model extended with different wrappers to annotate a wide variety of data. We apply TACCO to identify cell types and states, decipher spatiomolecular tissue structure at the cell and molecular level and resolve differentiation trajectories using synthetic and biological datasets. While matching or exceeding the accuracy of specialized tools for the individual tasks, TACCO reduces the computational requirements by up to an order of magnitude and scales to larger datasets (for example, considering the runtime of annotation transfer for 1 M simulated dropout observations).
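The abstract describes annotation transfer as an optimal transport problem over continuous mixtures. The sketch below illustrates that general idea only and is not TACCO's implementation or API. It assumes entropy-regularized optimal transport solved with Sinkhorn iterations, a squared Euclidean cost between normalized expression profiles, and a known prior over cell types. The function names `sinkhorn_plan` and `transfer_annotation`, the parameter `epsilon`, and the toy data are all hypothetical.

```python
import numpy as np


def sinkhorn_plan(cost, row_marginal, col_marginal, epsilon=0.1, n_iter=500):
    """Entropy-regularized optimal transport plan via Sinkhorn iterations."""
    kernel = np.exp(-cost / epsilon)
    u = np.ones_like(row_marginal)
    for _ in range(n_iter):
        # Alternately rescale so the plan matches column, then row marginals.
        v = col_marginal / (kernel.T @ u)
        u = row_marginal / (kernel @ v)
    return u[:, None] * kernel * v[None, :]


def transfer_annotation(counts, type_profiles, type_prior, epsilon=0.1):
    """Return per-observation cell type weights (each row sums to 1).

    counts:        (n_obs, n_genes) expression of the observations to annotate
    type_profiles: (n_types, n_genes) reference expression per annotated type
    type_prior:    (n_types,) assumed overall type fractions, summing to 1
    """
    # Compare compositions rather than raw depth (an assumption of this sketch).
    obs = counts / counts.sum(axis=1, keepdims=True)
    ref = type_profiles / type_profiles.sum(axis=1, keepdims=True)

    # Squared Euclidean cost, scaled to [0, 1] so epsilon has a stable meaning.
    cost = ((obs[:, None, :] - ref[None, :, :]) ** 2).sum(axis=-1)
    cost = cost / cost.max()

    # Each observation carries equal mass; types carry the prior mass.
    n_obs = counts.shape[0]
    plan = sinkhorn_plan(cost, np.full(n_obs, 1.0 / n_obs), type_prior, epsilon)

    # Row-normalize the plan to read it as a mixture annotation per observation.
    return plan / plan.sum(axis=1, keepdims=True)


# Toy example: two reference types and three observations,
# the third being an even mixture of both types.
type_profiles = np.array([[10.0, 1.0, 1.0],
                          [1.0, 1.0, 10.0]])
counts = np.array([[20.0, 2.0, 2.0],
                   [2.0, 2.0, 20.0],
                   [11.0, 2.0, 11.0]])
weights = transfer_annotation(counts, type_profiles, np.array([0.5, 0.5]))
print(np.round(weights, 3))
```

In this toy setup the first two observations are assigned almost entirely to one type each, while the mixed observation receives roughly equal weight for both, which is the kind of continuous annotation the abstract refers to. The fixed type prior acts as the column marginal of the transport plan; a different prior would shift the weights accordingly.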