School of Biomedical Engineering, Faculty of Applied Science and Faculty of Medicine, The University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo 153-8904, Japan.
Sci Adv. 2023 Jan 4;9(1):eadd2793. doi: 10.1126/sciadv.add2793.
Massively parallel DNA sequencing has led to the rapid growth of highly multiplexed experiments in biology. These experiments produce unique sequencing results that require specific analysis pipelines to decode highly structured reads. However, no versatile framework that interprets sequencing reads to extract their encoded information for downstream biological analysis has been developed. Here, we report INTERSTELLAR (interpretation, scalable transformation, and emulation of large-scale sequencing reads) that decodes data values encoded in theoretically any type of sequencing read and translates them into sequencing reads of another structure of choice. We demonstrated that INTERSTELLAR successfully extracted information from a range of short- and long-read sequencing reads and translated those of single-cell (sc)RNA-seq, scATAC-seq, and spatial transcriptomics to be analyzed by different software tools that have been developed for conceptually the same types of experiments. INTERSTELLAR will greatly facilitate the development of sequencing-based experiments and sharing of data analysis pipelines.
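The idea of decoding values from one read structure and re-encoding them in another can be illustrated with a minimal sketch. This is not INTERSTELLAR's interface. The layouts (`SRC_LAYOUT`, `DST_LAYOUT`), the segment names, and the helpers `decode`, `ValueTranslator`, and `translate` are all hypothetical, and the sketch assumes fixed-position segments with no sequencing errors or quality scores.

```python
from itertools import product

# Hypothetical read layouts: ordered (segment name, length in nt) pairs.
SRC_LAYOUT = [("cell", 16), ("umi", 12)]
DST_LAYOUT = [("cell", 12), ("umi", 8)]


def decode(read, layout):
    """Split a read into named segments by fixed position."""
    segments, pos = {}, 0
    for name, length in layout:
        segments[name] = read[pos:pos + length]
        pos += length
    return segments


class ValueTranslator:
    """Map each distinct source value to a distinct sequence of a new length.

    New sequences are handed out in order of first appearance, so the same
    source value always yields the same destination sequence.
    """

    def __init__(self, length):
        self._pool = ("".join(p) for p in product("ACGT", repeat=length))
        self._table = {}

    def __call__(self, value):
        if value not in self._table:
            self._table[value] = next(self._pool)
        return self._table[value]


def translate(reads):
    """Decode reads in SRC_LAYOUT and re-emit them in DST_LAYOUT."""
    translators = {name: ValueTranslator(length) for name, length in DST_LAYOUT}
    out = []
    for read in reads:
        seg = decode(read, SRC_LAYOUT)
        out.append("".join(translators[name](seg[name]) for name, _ in DST_LAYOUT))
    return out


reads = ["A" * 16 + "C" * 12, "A" * 16 + "G" * 12, "T" * 16 + "C" * 12]
for src, dst in zip(reads, translate(reads)):
    print(src, "->", dst)
```

Reads that share a source cell barcode keep a shared barcode after translation, which is the property a downstream tool expecting the destination layout would rely on. A real workflow would also need to handle variable-position segments, sequencing errors, and quality strings, none of which this sketch attempts.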