Chen Sijie, Luo Yanting, Gao Haoxiang, Li Fanhong, Chen Yixin, Li Jiaqi, You Renke, Hao Minsheng, Bian Haiyang, Xi Xi, Li Wenrui, Li Weiyu, Ye Mingli, Meng Qiuchen, Zou Ziheng, Li Chen, Li Haochen, Zhang Yangyuan, Cui Yanfei, Wei Lei, Chen Fufeng, Wang Xiaowo, Lv Hairong, Hua Kui, Jiang Rui, Zhang Xuegong
MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China.
Fuzhou Institute of Data Technology, Changle, Fuzhou 350200, China.
iScience. 2022 Apr 28;25(5):104318. doi: 10.1016/j.isci.2022.104318. eCollection 2022 May 20.
The accumulation of massive single-cell omics data provides growing resources for building biomolecular atlases of all cells of human organs or the whole body. The true assembly of a cell atlas should be cell-centric rather than file-centric. We developed a unified informatics framework for seamless cell-centric data assembly and built the human Ensemble Cell Atlas (hECA) from scattered data. hECA v1.0 assembled 1,093,299 labeled human cells from 116 published datasets, covering 38 organs and 11 systems. We invented three new methods of atlas applications based on the cell-centric assembly: "in data" cell sorting for targeted data retrieval with customizable logic expressions, "quantitative portraiture" for multi-view representations of biological entities, and customizable reference creation for generating references for automatic annotations. Case studies on agile construction of user-defined sub-atlases and "in data" investigation of CAR-T off-targets in multiple organs showed the great potential enabled by the cell-centric ensemble atlas.