Massachusetts Institute of Technology, Cambridge, MA, USA.
Mechanobiology Institute, National University of Singapore, Singapore, Singapore.
Nat Commun. 2021 Jan 4;12(1):31. doi: 10.1038/s41467-020-20249-2.
The development of single-cell methods for capturing different data modalities, including imaging and sequencing, has revolutionized our ability to identify heterogeneous cell states. Different data modalities provide different perspectives on a population of cells, and their integration is critical for studying cellular heterogeneity and its function. While various methods have been proposed to integrate different sequencing data modalities, coupling imaging and sequencing has remained an open challenge. Here we present an approach for integrating vastly different modalities by learning a probabilistic coupling between them, using autoencoders to map each modality to a shared latent space. We validate this approach by integrating single-cell RNA-seq and chromatin images to identify distinct subpopulations of human naive CD4+ T-cells that are poised for activation. Collectively, our approach provides a framework to integrate and translate between data modalities that cannot yet be measured within the same cell, for diverse applications in biomedical discovery.