clonevdjseq: A workflow and bioinformatics management system for sequencing, archiving, and analysis of VDJ sequences from clonal libraries.

Authors

Mitchell Keith, Hunter Samuel, Froenicke Lutz, Murray Karl, Settles Matthew, Trimmer James S

Affiliations

Department of Physiology and Membrane Biology, University of California Davis School of Medicine, Davis, CA, USA.

Bioinformatics Core, Genome Center, University of California Davis, Davis, CA, USA.

Publication

BMC Bioinformatics. 2025 Jul 21;26(1):186. doi: 10.1186/s12859-025-06107-2.

Abstract

BACKGROUND

Advances in next-generation sequencing technologies have facilitated extensive analysis of B cell receptor (BCR) and T cell receptor (TCR) sequences from monoclonal hybridoma libraries, single B cells, and single T cells, generating vast amounts of important data pertaining to antigen recognition. However, existing workflows and bioinformatics tools often lack the flexibility and scalability needed to handle large clonal-level datasets effectively. An initial system- and hybridoma-dependent version of this code was distributed as part of the NeuroMabSeq publication; clonevdjseq is intended as a technical addendum that offers broader system compatibility and enhanced modeling.

RESULTS

We present clonevdjseq, an integrated and accessible software solution leveraging Nextflow and Django. Developed primarily for large hybridoma libraries, the workflow and pipeline are amenable to BCR/TCR sequence analysis of homogeneous populations or clones of B and T cells, respectively. The clonevdjseq pipeline includes modules for read processing, amplicon denoising, and quality control of paired variable light/heavy chains of BCRs from B cells and hybridomas, or alpha (α)/beta (β) and delta (δ)/gamma (γ) chains of TCRs in T cell applications. The pipeline is built upon a robust, high-throughput library prep protocol, and its processed data have been verified across thousands of monoclonal antibodies. This effort has yielded sequences used to develop functional recombinant monoclonal antibodies and single chain variable fragments as part of the NeuroMabSeq initiative, in which thousands of hybridoma samples were processed (Mitchell et al. in Sci Rep 13(1):16200, 2023), and it provides additional modeling and extensibility to other modalities. The clonevdjseq software is accessible via Nextflow and also offers a database and web app as an optional final processing step for dissemination of results and data exploration.

CONCLUSIONS

clonevdjseq offers a comprehensive and scalable solution for the processing and analysis of large monoclonal and oligoclonal VDJ datasets. Its modular design, dynamic pipeline, and robust database integration facilitate efficient data management and analysis. The platform is publicly available and aims to support the research community by providing an accessible and flexible tool for archiving and disseminating BCR sequences from hybridomas, and it is applicable to other uses such as TCR sequences from single-cell T cell populations.
