

Immune Repertoire Analysis on High-Performance Computing Using VDJServer V1: A Method by the AIRR Community.

Author information

Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, USA.

Center for Translational Medicine, Immunology, and Transplantation, Immundiagnostik, Marien Hospital Herne, University Hospital of the Ruhr-University Bochum, Herne, Germany.

Publication information

Methods Mol Biol. 2022;2453:439-446. doi: 10.1007/978-1-0716-2115-8_22.

Abstract

AIRR-seq data sets are usually large and require specialized analysis methods and software tools. A typical Illumina MiSeq sequencing run generates 20-30 million 2 × 300 bp paired-end sequence reads, which roughly corresponds to 15 GB of sequence data to be processed. Other platforms like NextSeq, which is useful in projects where the full V gene is not needed, create about 400 million 2 × 150 bp paired-end reads. Because of the size of the data sets, the analysis can be computationally expensive, particularly the early analysis steps like preprocessing and gene annotation that process the majority of the sequence data. A standard desktop PC may take 3-5 days of constant processing for a single MiSeq run, so dedicated high-performance computational resources may be required.

VDJServer provides free access to high-performance computing (HPC) at the Texas Advanced Computing Center (TACC) through a graphical user interface (Christley et al. Front Immunol 9:976, 2018). VDJServer is a cloud-based analysis portal for immune repertoire sequence data that provides access to a suite of tools for a complete analysis workflow, including modules for preprocessing and quality control of sequence reads, V(D)J gene assignment, repertoire characterization, and repertoire comparison. Furthermore, VDJServer has parallelized execution for tools such as IgBLAST, so more compute resources are utilized as the size of the input data grows. Analysis that takes days on a desktop PC might take only a few hours on VDJServer. VDJServer is a free, publicly available, and open-source licensed resource. Here, we describe the workflow for performing immune repertoire analysis on VDJServer's high-performance computing.
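As a rough check on the data volumes quoted in the abstract, the following Python sketch computes the number of sequenced bases per run for each platform. It is only a back-of-the-envelope estimate: it counts bases alone and ignores FASTQ headers, quality strings, and compression, which is an assumption not stated in the source. The function name `total_bases` is hypothetical and used only for illustration.

```python
def total_bases(read_pairs: int, read_length: int) -> int:
    """Bases produced by a paired-end run: two reads per pair."""
    return read_pairs * 2 * read_length

# MiSeq: 20-30 million pairs of 2 x 300 bp reads (figures from the abstract).
miseq_low = total_bases(20_000_000, 300)
miseq_high = total_bases(30_000_000, 300)

# NextSeq: about 400 million pairs of 2 x 150 bp reads (figure from the abstract).
nextseq = total_bases(400_000_000, 150)

GIGA = 10**9  # assumption: decimal giga, one unit per base
print(f"MiSeq: {miseq_low / GIGA:.0f}-{miseq_high / GIGA:.0f} Gbases")
print(f"NextSeq: {nextseq / GIGA:.0f} Gbases")
```

Under these assumptions the MiSeq range of 12-18 Gbases brackets the roughly 15 GB figure given in the abstract, while a NextSeq run comes to about 120 Gbases.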


Similar articles

Data Sharing and Reuse: A Method by the AIRR Community.
Methods Mol Biol. 2022;2453:447-476. doi: 10.1007/978-1-0716-2115-8_23.
