Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, 02138, USA.
Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
Nat Commun. 2020 Feb 13;11(1):866. doi: 10.1038/s41467-020-14667-5.
A widespread assumption for single-cell analyses specifies that one cell's nucleic acids are predominantly captured by one oligonucleotide barcode. Here, we show that ~13-21% of cell barcodes from the 10x Chromium scATAC-seq assay may have been derived from a droplet with more than one oligonucleotide sequence, which we call "barcode multiplets". We demonstrate that barcode multiplets can be derived from at least two different sources. First, we confirm that approximately 4% of droplets from the 10x platform may contain multiple beads. Additionally, we find that approximately 5% of beads may contain detectable levels of multiple oligonucleotide barcodes. We show that this artifact can confound single-cell analyses, including the interpretation of clonal diversity and proliferation of intra-tumor lymphocytes. Overall, our work provides a conceptual and computational framework to identify and assess the impacts of barcode multiplets in single-cell data.
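The abstract reports some rates per droplet (~4%) and per bead (~5%), while the headline figure (~13-21%) is per cell barcode. A back-of-envelope sketch in Python shows why a barcode-level fraction is larger than the unit-level fraction behind it. This is an illustration only, not the authors' computational framework. The function name and the default of two barcodes per affected unit are assumptions made for this example.

```python
def affected_barcode_fraction(unit_fraction, barcodes_per_affected_unit=2):
    """Fraction of all barcodes that come from affected units.

    unit_fraction: share of units (droplets or beads) carrying more than
        one barcode, e.g. 0.04 for droplets with multiple beads.
    barcodes_per_affected_unit: assumed number of barcodes in each affected
        unit (hypothetical default of 2; the abstract does not state this).
    Unaffected units are assumed to contribute exactly one barcode each.
    """
    if not 0.0 <= unit_fraction <= 1.0:
        raise ValueError("unit_fraction must be between 0 and 1")
    if barcodes_per_affected_unit < 2:
        raise ValueError("an affected unit carries at least two barcodes")
    affected = unit_fraction * barcodes_per_affected_unit
    unaffected = 1.0 - unit_fraction
    return affected / (affected + unaffected)


if __name__ == "__main__":
    # Rates taken from the abstract; the two-barcode assumption is ours.
    for label, rate in [("multi-bead droplets", 0.04),
                        ("multi-barcode beads", 0.05)]:
        share = affected_barcode_fraction(rate)
        print(f"{label}: {rate:.0%} of units -> {share:.1%} of barcodes")
```

Under the two-barcode assumption, each affected unit contributes twice as many barcodes as an unaffected one, so the share of barcodes affected is nearly double the share of units affected. The sketch treats the two sources separately and does not attempt to reproduce the 13-21% estimate.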