scMultiSim: simulation of multi-modality single cell data guided by cell-cell interactions and gene regulatory networks.

Author information

Li Hechen, Zhang Ziqi, Squires Michael, Chen Xi, Zhang Xiuwei

Affiliations

Georgia Institute of Technology, Atlanta, USA.

Southern University of Science and Technology, China.

Publication information

Res Sq. 2023 Mar 15:rs.3.rs-2675530. doi: 10.21203/rs.3.rs-2675530/v1.

Abstract

Simulated single-cell data is essential for designing and evaluating computational methods in the absence of experimental ground truth. Existing simulators typically focus on modeling one or two specific biological factors or mechanisms that affect the output data, which limits their capacity to simulate the complexity and multi-modality of real data. Here, we present scMultiSim, a simulator that generates multi-modal single-cell data, including gene expression, chromatin accessibility, RNA velocity, and spatial cell locations, while accounting for the relationships between modalities. scMultiSim jointly models various biological factors that affect the output data, including cell identity, within-cell gene regulatory networks (GRNs), cell-cell interactions (CCIs), and chromatin accessibility, while also incorporating technical noise. Moreover, it allows users to adjust each factor's effect easily. We validated scMultiSim's simulated biological effects and demonstrated its applications by benchmarking a wide range of computational tasks, including cell clustering and trajectory inference, multi-modal and multi-batch data integration, RNA velocity estimation, GRN inference, and CCI inference using spatially resolved gene expression data. Compared to existing simulators, scMultiSim can benchmark a much broader range of existing computational problems and even new potential tasks.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0df8/10055660/a2a8dadb198b/nihpp-rs2675530v1-f0001.jpg
