Department of Informatics, J. Craig Venter Institute, La Jolla, CA 92037, USA.
Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY 14642, USA.
Bioinformatics. 2022 Oct 14;38(20):4735-4744. doi: 10.1093/bioinformatics/btac585.
Flow cytometry (FCM) and transcription profiling are two widely used assays in translational immunology research. However, there is no data integration pipeline for analyzing these two types of assays together with experiment variables for biomarker inference. Current FCM data analysis relies mainly on subjective manual gating, which is difficult to integrate directly with other automated computational methods. Existing deconvolution analysis of bulk transcriptomics relies on predefined marker genes in the transcriptomics data, which are unavailable for novel cell types, and it does not use the FCM data that provide canonical phenotypic definitions of the cell types.
We developed FastMix, a novel analytics pipeline for computational immunology that integrates flow cytometry, bulk transcriptomics and clinical covariates to identify cell type-specific gene expression signatures and biomarker genes. FastMix addresses the 'large p, small n' problem in the integrated analysis of gene expression and flow cytometry data via a linear mixed effects model (LMER) for both cross-sectional and longitudinal studies. Its novel moment-based estimator not only reduces bias in parameter estimation but is also more efficient than iterative optimization. The FastMix pipeline also includes DAFi, a cutting-edge flow cytometry data analysis method for identifying cell populations of interest and their characteristics. Simulation studies showed that FastMix produced lower type I and type II error rates than competing methods. Validation using real data from two vaccine studies showed that FastMix identified signature genes consistent with those from an independent single-cell RNA-seq analysis, and produced additional interesting findings.
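To make the integration idea concrete, the following is a minimal sketch (not the FastMix implementation) of regressing bulk expression on flow-derived cell-type proportions, one gene at a time, and then summarizing the per-gene coefficients across genes by their first two moments. The function names `per_gene_ols` and `moment_summary`, the simulated data and all dimensions are assumptions made for illustration. Clinical covariates and the bias correction of the actual moment-based estimator are omitted.

```python
import numpy as np


def per_gene_ols(expr, props):
    """Least-squares fit of each gene's bulk expression on cell-type proportions.

    expr:  (n_subjects, n_genes) bulk expression matrix
    props: (n_subjects, n_cell_types) proportions; rows sum to 1, so no
           intercept is added (it would be collinear with the proportions)
    Returns an (n_genes, n_cell_types) matrix of cell type-specific coefficients.
    """
    coef, *_ = np.linalg.lstsq(props, expr, rcond=None)
    return coef.T


def moment_summary(coef):
    """Naive across-gene mean and covariance of the per-gene coefficients.

    This treats the coefficients as draws from a common distribution, in the
    spirit of a random-effects model. The covariance here still contains
    sampling noise, which a proper estimator would need to account for.
    """
    return coef.mean(axis=0), np.cov(coef, rowvar=False)


# Simulated data (hypothetical sizes): many genes, few subjects.
rng = np.random.default_rng(0)
n_subjects, n_genes, n_types = 30, 200, 3
props = rng.dirichlet(np.ones(n_types), size=n_subjects)
true_coef = rng.normal(5.0, 1.0, size=(n_genes, n_types))
noise = rng.normal(0.0, 0.1, size=(n_subjects, n_genes))
expr = props @ true_coef.T + noise

coef = per_gene_ols(expr, props)
mean, cov = moment_summary(coef)
```

Because the moment summary is computed in closed form from the per-gene fits, no iterative optimization is needed in this sketch, which loosely mirrors the efficiency argument made for a moment-based estimator.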
Source code of FastMix is publicly available at https://github.com/terrysun0302/FastMix.
Supplementary data are available at Bioinformatics online.