Suzhou Institute of Systems Medicine, Center for Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Suzhou, Jiangsu, China.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbaa358.
Tissue immune cells have long been recognized as important regulators of homeostasis. Quantifying the abundance of different immune cell types improves our understanding of how immune cells relate to normal and abnormal states. Many computational methods have now been developed to predict tissue immune cell composition from bulk transcriptomes, so a summary of their advantages and disadvantages is timely. An examination of the challenges facing these computational models, and of possible solutions, should also help the field develop. The common hypothesis underlying these models is that the expression of signature genes for each immune cell type may reflect the proportion of those cells contributing to the tissue transcriptome. We group the reported tools into three categories: reference-free methods, reference-based scoring methods, and reference-based deconvolution methods. This review summarizes the currently reported computational tools for immune cell quantification, together with their applications, limitations, and perspectives. We also identify several critical problems that limit the performance and application of these models: incomplete coverage of immune cell types, the collinearity problem, the influence of the tissue environment on immune cell expression levels, and the lack of standard datasets for model validation. To address these issues, we propose tissue-specific training datasets that include all known immune cells, a hierarchical computational framework, and benchmark datasets that contain both tissue expression profiles and the abundances of all immune cells.
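As a rough illustration of the hypothesis behind reference-based deconvolution, the sketch below treats a bulk profile as a fraction-weighted sum of cell-type signature profiles and recovers the fractions by non-negative least squares. Everything in it is an assumption made for illustration: the signature values, the gene and cell-type labels, and the helper names `mix` and `deconvolve` are hypothetical, and the plain projected gradient solver stands in for whatever solver a given published tool actually uses.

```python
# Hypothetical toy signature matrix: rows are genes, columns are cell types.
# All values are made up for illustration and come from no real dataset.
CELL_TYPES = ["T cell", "B cell", "Macrophage"]
SIGNATURE = [
    [10.0, 1.0, 0.5],   # gene A: mostly T cell
    [0.5, 8.0, 1.0],    # gene B: mostly B cell
    [1.0, 0.5, 12.0],   # gene C: mostly macrophage
    [4.0, 4.0, 1.0],    # gene D: shared by T and B cells
]


def mix(signature, fractions):
    """Simulate a bulk profile as a fraction-weighted sum of cell-type profiles."""
    return [sum(s * f for s, f in zip(row, fractions)) for row in signature]


def deconvolve(signature, bulk, steps=2000):
    """Estimate non-negative cell fractions by projected gradient descent
    on the least-squares objective 0.5 * ||S w - bulk||^2."""
    n_genes = len(signature)
    n_types = len(signature[0])
    # The squared Frobenius norm of S bounds the largest eigenvalue of S^T S,
    # so its reciprocal is a safe (conservative) step size.
    step = 1.0 / sum(v * v for row in signature for v in row)
    w = [1.0 / n_types] * n_types
    for _ in range(steps):
        residual = [
            sum(s * x for s, x in zip(row, w)) - b
            for row, b in zip(signature, bulk)
        ]
        grad = [
            sum(signature[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, x - step * gk) for x, gk in zip(w, grad)]
    total = sum(w)
    # Rescale so the estimated fractions sum to one.
    return [x / total for x in w] if total > 0 else w


true_fractions = [0.5, 0.3, 0.2]
bulk = mix(SIGNATURE, true_fractions)
estimated = deconvolve(SIGNATURE, bulk)
for name, frac in zip(CELL_TYPES, estimated):
    print(f"{name}: {frac:.3f}")
```

Because the toy bulk profile is noise-free and the three signature columns are far from collinear, the estimate should closely match the simulated fractions. When cell types have very similar signature profiles, the columns become nearly collinear and the fractions are much harder to separate, which is the collinearity problem noted above.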