A universal sequencing read interpreter.

Affiliations

School of Biomedical Engineering, Faculty of Applied Science and Faculty of Medicine, The University of British Columbia, Vancouver, BC V6T 1Z3, Canada.

Research Center for Advanced Science and Technology, The University of Tokyo, Tokyo 153-8904, Japan.

Publication information

Sci Adv. 2023 Jan 4;9(1):eadd2793. doi: 10.1126/sciadv.add2793.

DOI: 10.1126/sciadv.add2793

PMID: 36598975

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9812397/

Abstract

Massively parallel DNA sequencing has led to the rapid growth of highly multiplexed experiments in biology. These experiments produce unique sequencing results that require specific analysis pipelines to decode highly structured reads. However, no versatile framework that interprets sequencing reads to extract their encoded information for downstream biological analysis has been developed. Here, we report INTERSTELLAR (interpretation, scalable transformation, and emulation of large-scale sequencing reads) that decodes data values encoded in theoretically any type of sequencing read and translates them into sequencing reads of another structure of choice. We demonstrated that INTERSTELLAR successfully extracted information from a range of short- and long-read sequencing reads and translated those of single-cell (sc)RNA-seq, scATAC-seq, and spatial transcriptomics to be analyzed by different software tools that have been developed for conceptually the same types of experiments. INTERSTELLAR will greatly facilitate the development of sequencing-based experiments and sharing of data analysis pipelines.
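The decode-then-translate idea described in the abstract can be illustrated with a minimal sketch. This is not INTERSTELLAR's interface or algorithm: the read layout, the linker sequence, the destination segment lengths, and the function names `decode`, `make_translator`, and `convert` are all hypothetical and chosen only for illustration.

```python
import re
from itertools import product

# Hypothetical source layout (an assumption, not taken from the paper):
# 8-nt cell barcode, 6-nt UMI, constant linker "GTCA", then the insert.
SOURCE_LAYOUT = re.compile(
    r"^(?P<cell>[ACGT]{8})(?P<umi>[ACGT]{6})GTCA(?P<insert>[ACGTN]+)$"
)


def decode(read):
    """Return the named segment values of a read, or None if it does not match."""
    match = SOURCE_LAYOUT.match(read)
    return match.groupdict() if match else None


def make_translator(dest_length):
    """Build a function mapping each distinct source value to a unique
    destination sequence of dest_length nucleotides."""
    pool = ("".join(p) for p in product("ACGT", repeat=dest_length))
    table = {}

    def translate(value):
        # The same source value always receives the same destination sequence.
        if value not in table:
            table[value] = next(pool)
        return table[value]

    return translate


def convert(reads):
    """Re-encode reads into a hypothetical destination layout:
    (16-nt cell barcode + 12-nt UMI, insert). Unmatched reads are skipped."""
    to_cell = make_translator(16)
    to_umi = make_translator(12)
    converted = []
    for read in reads:
        segments = decode(read)
        if segments is None:
            continue
        tag = to_cell(segments["cell"]) + to_umi(segments["umi"])
        converted.append((tag, segments["insert"]))
    return converted


if __name__ == "__main__":
    example_reads = [
        "AAAACCCCGGGTTTGTCAACGTACGT",  # cell AAAACCCC, UMI GGGTTT
        "AAAACCCCTTTGGGGTCATTTT",      # same cell, different UMI
        "NNNN",                        # does not fit the layout
    ]
    for tag, insert in convert(example_reads):
        print(tag, insert)
```

The point of the sketch is that the encoded values (which cell, which molecule) are kept while their sequence representation changes, so reads that share a source barcode still share a barcode after conversion.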

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/996c/9812397/e17d5b55ea97/sciadv.add2793-f1.jpg

Similar articles

Split Pool Ligation-based Single-cell Transcriptome sequencing (SPLiT-seq) data processing pipeline comparison.

BMC Genomics. 2024 Apr 12;25(1):361. doi: 10.1186/s12864-024-10285-3.

Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm.

Bioinformatics. 2020 Jun 1;36(12):3669-3679. doi: 10.1093/bioinformatics/btaa179.

Mapping accuracy of short reads from massively parallel sequencing and the implications for quantitative expression profiling.

PLoS One. 2009 Jul 28;4(7):e6323. doi: 10.1371/journal.pone.0006323.

Querying large read collections in main memory: a versatile data structure.

BMC Bioinformatics. 2011 Jun 17;12:242. doi: 10.1186/1471-2105-12-242.

Pash 3.0: A versatile software package for read mapping and integrative analysis of genomic and epigenomic variation using massively parallel DNA sequencing.

BMC Bioinformatics. 2010 Nov 23;11:572. doi: 10.1186/1471-2105-11-572.

Differentially expressed genes from RNA-Seq and functional enrichment results are affected by the choice of single-end versus paired-end reads and stranded versus non-stranded protocols.

BMC Genomics. 2017 May 23;18(1):399. doi: 10.1186/s12864-017-3797-0.

Improving nanopore read accuracy with the R2C2 method enables the sequencing of highly multiplexed full-length single-cell cDNA.

Proc Natl Acad Sci U S A. 2018 Sep 25;115(39):9726-9731. doi: 10.1073/pnas.1806447115. Epub 2018 Sep 10.

VirION2: a short- and long-read sequencing and informatics workflow to study the genomic diversity of viruses in nature.

PeerJ. 2021 Mar 30;9:e11088. doi: 10.7717/peerj.11088. eCollection 2021.

Improving PacBio long read accuracy by short read alignment.

PLoS One. 2012;7(10):e46679. doi: 10.1371/journal.pone.0046679. Epub 2012 Oct 4.

Cited by

A multi-kingdom genetic barcoding system for precise clone isolation.

Nat Biotechnol. 2025 May 21. doi: 10.1038/s41587-025-02649-1.

Flexible parsing, interpretation, and editing of technical sequences with splitcode.

Bioinformatics. 2024 Jun 3;40(6). doi: 10.1093/bioinformatics/btae331.

Applications of single‑cell omics and spatial transcriptomics technologies in gastric cancer (Review).

Oncol Lett. 2024 Feb 14;27(4):152. doi: 10.3892/ol.2024.14285. eCollection 2024 Apr.

Flexible parsing, interpretation, and editing of technical sequences with splitcode.

bioRxiv. 2023 Dec 9:2023.03.20.533521. doi: 10.1101/2023.03.20.533521.

References

PacRAT: a program to improve barcode-variant mapping from PacBio long reads using multiple sequence alignment.

Bioinformatics. 2022 May 13;38(10):2927-2929. doi: 10.1093/bioinformatics/btac165.

Spatial genomics enables multi-modal study of clonal heterogeneity in tissues.

Nature. 2022 Jan;601(7891):85-91. doi: 10.1038/s41586-021-04217-4. Epub 2021 Dec 15.

Single-cell chromatin state analysis with Signac.

Nat Methods. 2021 Nov;18(11):1333-1341. doi: 10.1038/s41592-021-01282-5. Epub 2021 Nov 1.

Embryo-scale, single-cell spatial transcriptomics.

Science. 2021 Jul 2;373(6550):111-117. doi: 10.1126/science.abb9536.

Multiplexing mutation rate assessment: determining pathogenicity of Msh2 variants in Saccharomyces cerevisiae.

Genetics. 2021 Jun 24;218(2). doi: 10.1093/genetics/iyab058.

Comprehensive analysis of single cell ATAC-seq data with SnapATAC.

Nat Commun. 2021 Feb 26;12(1):1337. doi: 10.1038/s41467-021-21583-9.

Long-read human genome sequencing and its applications.

Nat Rev Genet. 2020 Oct;21(10):597-614. doi: 10.1038/s41576-020-0236-x. Epub 2020 Jun 5.

Systematic comparison of single-cell and single-nucleus RNA-sequencing methods.

Nat Biotechnol. 2020 Jun;38(6):737-746. doi: 10.1038/s41587-020-0465-8. Epub 2020 Apr 6.

Lineage tracing on transcriptional landscapes links state to fate during differentiation.

Science. 2020 Feb 14;367(6479). doi: 10.1126/science.aaw3381. Epub 2020 Jan 23.

The role of structural pleiotropy and regulatory evolution in the retention of heteromers of paralogs.

Elife. 2019 Aug 27;8:e46754. doi: 10.7554/eLife.46754.
