Zheng Ye, Caron Daniel P, Kim Ju Yeong, Jun Seong-Hwan, Tian Yuan, Mair Florian, Stuart Kenneth D, Sims Peter A, Gottardo Raphael
Basic Science Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA.
Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA.
Res Sq. 2024 Jul 8:rs.3.rs-4572811. doi: 10.21203/rs.3.rs-4572811/v1.
CITE-seq enables paired measurement of surface protein and mRNA expression in single cells using antibodies conjugated to oligonucleotide tags. Because surface protein molecules are present at high copy number, sequencing antibody-derived tags (ADTs) allows for robust protein detection, improving cell-type identification. However, variability in antibody staining leads to batch effects in ADT expression, obscuring biological variation, reducing interpretability, and obstructing cross-study analyses. Here, we present ADTnorm (https://github.com/yezhengSTAT/ADTnorm), a normalization and integration method designed explicitly for ADT abundance. Benchmarking against 14 existing scaling and normalization methods, we show that ADTnorm accurately aligns populations with negative and positive expression of surface protein markers across 13 public datasets, effectively removing technical variation across batches and improving cell-type separation. ADTnorm enables efficient integration of public CITE-seq datasets, each with a unique experimental design, paving the way for atlas-level analyses. Beyond normalization, ADTnorm includes built-in utilities to aid automated threshold gating and to assess antibody staining quality for titration optimization and antibody panel selection. Applying ADTnorm to a published COVID-19 CITE-seq dataset identified previously undetected disease-associated markers, illustrating its broad utility in biological applications.