Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas, Rosario, Argentina.
Facultad de Ciencias Exactas, Ingeniería y Agrimensura, Universidad Nacional de Rosario, Rosario, Argentina.
Sci Rep. 2022 May 10;12(1):7619. doi: 10.1038/s41598-022-11656-0.
Nucleic-acid barcoding is an enabling technique for many applications, but its use remains limited in emerging long-read sequencing technologies with intrinsically low raw accuracy. Here, we apply so-called NS-watermark barcodes, whose error correction capability was previously validated in silico, in a proof of concept where we synthesize 3840 NS-watermark barcodes and use them to asymmetrically tag and simultaneously sequence amplicons from two evolutionarily distant species (namely Bordetella pertussis and Drosophila mojavensis) on the ONT MinION platform. To our knowledge, this is the largest number of distinct, non-random tags ever sequenced in parallel and the first report of microarray-based synthesis as a source for large oligonucleotide pools for barcoding. We recovered the identity of more than 86% of the barcodes, with a crosstalk rate of 0.17% (i.e., one misassignment every 584 reads). This falls in the range of the index hopping rate of established, high-accuracy Illumina sequencing, despite the increased number of tags and the relatively low accuracy of both microarray-based synthesis and long-read sequencing. The robustness of NS-watermark barcodes, together with their scalable design and compatibility with low-cost massive synthesis, makes them promising for present and future sequencing applications requiring massive labeling, such as long-read single-cell RNA-Seq.
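The two headline figures in the abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the inputs are the rounded values quoted in the abstract (3840 barcodes, 86% recovered, one misassignment every 584 reads), not the underlying read counts.

```python
def crosstalk_percent(reads_per_misassignment: float) -> float:
    """Crosstalk rate in percent, given one misassigned read per N reads."""
    return 100.0 / reads_per_misassignment


def barcodes_at_percent(total_barcodes: int, percent: float) -> float:
    """Number of barcodes corresponding to a given percentage of the pool."""
    return total_barcodes * percent / 100.0


# Values quoted in the abstract (rounded, so results are approximate).
rate = crosstalk_percent(584)
threshold = barcodes_at_percent(3840, 86)

print(f"crosstalk rate: {rate:.2f}%")
print(f"86% of 3840 barcodes: about {threshold:.0f}")
```

Under these assumptions, one misassignment per 584 reads rounds to the quoted 0.17%, and "more than 86%" of 3840 corresponds to somewhat over 3,300 recovered barcodes.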