PhenoComb: a discovery tool to assess complex phenotypes in high-dimensional single-cell datasets.

Author information

Burke Paulo E P, Strange Ann, Monk Emily, Thompson Brian, Amato Carol M, Woods David M

Affiliations

Division of Medical Oncology, Department of Medicine, University of Colorado School of Medicine, Aurora, CO 80045, USA.

Publication information

Bioinform Adv. 2022 Aug 3;2(1):vbac052. doi: 10.1093/bioadv/vbac052. eCollection 2022.

Abstract

MOTIVATION

High-dimensional cytometry assays can simultaneously measure dozens of markers, enabling the investigation of complex phenotypes. However, because manual gating relies on prior biological knowledge, often only a few marker combinations are assessed. As a result, complex phenotypes with potential biological relevance are overlooked. Here, we present PhenoComb, an R package that allows agnostic exploration of phenotypes by assessing all combinations of markers. PhenoComb uses signal intensity thresholds to assign markers to discrete states (e.g. negative, low, high) and then counts the cells per sample for every possible marker combination in a memory-safe manner. Time and disk space are the only constraints on the number of markers evaluated. PhenoComb also provides several approaches to perform statistical comparisons, evaluate the relevance of phenotypes and assess the independence of identified phenotypes. PhenoComb allows users to guide the analysis by adjusting several function arguments, such as identifying parent populations of interest, filtering out low-frequency populations and defining a maximum complexity of phenotypes to evaluate. We have designed PhenoComb to run on either a local computer or a server.
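
The thresholding and counting idea described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not PhenoComb's R API: the marker names, the threshold values, the two-state ("-"/"+") scheme and the functions `discretize` and `count_phenotypes` are all assumptions made for this example. It handles a single sample and keeps every count in memory, so it does not reproduce the memory-safe, disk-backed behavior the abstract describes.

```python
from collections import Counter
from itertools import combinations

# Hypothetical thresholds: one cut-off per marker gives two states ("-" / "+").
THRESHOLDS = {"CD3": 1.0, "CD4": 0.5, "CD8": 0.8}


def discretize(cell, thresholds):
    """Map each marker intensity to '+' (at or above threshold) or '-' (below)."""
    return {m: "+" if cell[m] >= t else "-" for m, t in thresholds.items()}


def count_phenotypes(cells, thresholds, max_markers=None):
    """Count the cells matching each observed combination of marker states.

    A phenotype is a tuple of (marker, state) pairs over a non-empty subset
    of markers; markers left out of the tuple are unconstrained.
    max_markers caps how many markers a phenotype may constrain.
    """
    markers = sorted(thresholds)
    if max_markers is None:
        max_markers = len(markers)
    counts = Counter()
    for cell in cells:
        states = discretize(cell, thresholds)
        for k in range(1, max_markers + 1):
            for subset in combinations(markers, k):
                counts[tuple((m, states[m]) for m in subset)] += 1
    return counts


# Toy single-sample data (made-up intensities).
cells = [
    {"CD3": 2.0, "CD4": 0.9, "CD8": 0.1},  # CD3+ CD4+ CD8-
    {"CD3": 1.5, "CD4": 0.1, "CD8": 1.2},  # CD3+ CD4- CD8+
    {"CD3": 0.2, "CD4": 0.0, "CD8": 0.0},  # CD3- CD4- CD8-
]
counts = count_phenotypes(cells, THRESHOLDS)
for phenotype, n in sorted(counts.items()):
    print(" ".join(m + s for m, s in phenotype), n)
```

Passing `max_markers=2` would restrict the count to phenotypes that constrain at most two markers, which is one way to picture the "maximum complexity" argument mentioned above.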

RESULTS

In performance tests on synthetic datasets, PhenoComb completed computation on 16 markers on the scale of minutes and on up to 26 markers on the scale of hours. We applied PhenoComb to two publicly available datasets: an HIV flow cytometry dataset (12 markers and 421 samples) and the COVIDome CyTOF dataset (40 markers and 99 samples). In the HIV dataset, PhenoComb identified immune phenotypes associated with HIV seroconversion, including those highlighted in the original publication. In the COVID dataset, we identified several immune phenotypes with altered frequencies in infected individuals relative to healthy individuals. Collectively, PhenoComb represents a powerful discovery tool for agnostically assessing high-dimensional single-cell data.
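
One way to see why the marker count dominates the cost is to count how many phenotypes exist in the first place. The sketch below is plain combinatorics rather than PhenoComb's own accounting: it assumes each marker is either ignored or fixed to one of `n_states` discrete states, and the function name `n_phenotypes` and the two-state default are assumptions for illustration.

```python
from math import comb


def n_phenotypes(n_markers, n_states=2, max_complexity=None):
    """Number of distinct phenotypes over n_markers markers.

    Each phenotype constrains k markers (1 <= k <= max_complexity) and fixes
    each of them to one of n_states states; the remaining markers are ignored.
    """
    if max_complexity is None:
        max_complexity = n_markers
    return sum(
        comb(n_markers, k) * n_states**k
        for k in range(1, max_complexity + 1)
    )


for n in (12, 16, 26):
    print(n, "markers:", n_phenotypes(n), "phenotypes")

# Capping complexity shrinks the search space sharply for large panels.
print("40 markers, at most 3 per phenotype:", n_phenotypes(40, max_complexity=3))
```

With two states per marker and no complexity cap, the total is 3^n - 1, so under these assumptions going from 16 to 26 markers multiplies the number of phenotypes by 3^10 = 59,049. A cap on phenotype complexity keeps a large panel tractable for the same reason.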

AVAILABILITY AND IMPLEMENTATION

The PhenoComb R package can be downloaded from https://github.com/SciOmicsLab/PhenoComb.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics Advances online.
