Konstantinovsky Thomas, Peres Ayelet, Eisenberg Ran, Polak Pazit, Lindenbaum Ofir, Yaari Gur
Department of Bioengineering, Faculty of Engineering, Bar Ilan University, 5290002 Ramat Gan, Israel.
Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, 5290002 Ramat Gan, Israel.
Nucleic Acids Res. 2025 Jul 8;53(13). doi: 10.1093/nar/gkaf651.
Sequence alignment of immunoglobulin (Ig) sequences is central to the computational analysis of adaptive immune receptor repertoire sequencing (AIRR-seq) data, impacting adaptive immunity research and antibody engineering. Traditional Ig sequence aligners often struggle to handle the complexities of V(D)J recombination and somatic hypermutation (SHM), resulting in suboptimal allele assignment accuracy and sequence segmentation. We introduce AlignAIR, a novel deep learning-based aligner that leverages advanced simulation approaches and a multi-task learning framework. AlignAIR sets new state-of-the-art results in allele assignment accuracy, productivity assessments, sequence segmentation, and speed. The model's latent space captures SHM characteristics, offering more profound insights into sequence variability. AlignAIR is designed for seamless integration with existing AIRR-seq pipelines and includes a user-friendly web interface and a container image for efficient local processing of millions of sequences. AlignAIR represents a significant advancement in immunogenetics research and antibody engineering, providing a critical resource for analyzing adaptive immune receptor repertoires.