AIRR-C IG Reference Sets: curated sets of immunoglobulin heavy and light chain germline genes.

Affiliations

School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, Australia.

Department of Immunotechnology, and SciLifeLab, Lund University, Lund, Sweden.

Publication information

Front Immunol. 2024 Feb 9;14:1330153. doi: 10.3389/fimmu.2023.1330153. eCollection 2023.

Abstract

INTRODUCTION

Analysis of an individual's immunoglobulin (IG) gene repertoire requires the use of high-quality germline gene reference sets. When sets contain only alleles supported by strong evidence, AIRR sequencing (AIRR-seq) data analysis is more accurate, and studies of the evolution of IG genes, their allelic variants and the expressed immune repertoire are therefore facilitated.

METHODS

The Adaptive Immune Receptor Repertoire Community (AIRR-C) IG Reference Sets have been developed by including only human IG heavy and light chain alleles that have been confirmed by evidence from multiple high-quality sources. To further improve AIRR-seq analysis, some alleles have been extended to deal with short 3' or 5' truncations that can lead them to be overlooked by alignment utilities. To avoid other challenges for analysis programs, exact paralogs (e.g. IGHV1-69*01 and IGHV1-69D*01) are represented only once in each set, though alternative sequence names are noted in the accompanying metadata.
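The handling of exact paralogs described above can be illustrated with a minimal sketch. It is not the AIRR-C curation pipeline: the function name `collapse_exact_paralogs`, the record layout, and the rule of keeping the first name seen as the primary name are all assumptions made for illustration, and the sequences are toy strings rather than real germline data.

```python
def collapse_exact_paralogs(records):
    """Collapse alleles with identical sequences into one entry each.

    records: iterable of (name, sequence) pairs.
    Returns a list of dicts with 'name', 'sequence' and 'alternative_names'.
    Assumption: the first name seen for a sequence becomes the primary name.
    """
    collapsed = {}  # keyed by sequence; dicts preserve insertion order
    for name, sequence in records:
        key = sequence.upper()  # compare sequences case-insensitively
        if key in collapsed:
            # Same sequence already present: record the extra name as metadata
            collapsed[key]["alternative_names"].append(name)
        else:
            collapsed[key] = {
                "name": name,
                "sequence": key,
                "alternative_names": [],
            }
    return list(collapsed.values())


# Toy sequences for illustration only (not real germline data)
toy_records = [
    ("IGHV1-69*01", "CAGGTGCAGCTGGTG"),
    ("IGHV1-69D*01", "caggtgcagctggtg"),  # identical to the entry above
    ("IGHV1-69*02", "CAGGTCCAGCTGGTG"),
]

germline_set = collapse_exact_paralogs(toy_records)
for entry in germline_set:
    print(entry["name"], entry["alternative_names"])
```

Keeping one entry per distinct sequence means an aligner never has to choose arbitrarily between two identical references, while the alternative names stay recoverable from the metadata.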

RESULTS AND DISCUSSION

The Reference Sets include fewer than half of the previously recognised IG alleles (e.g. just 198 IGHV sequences) and also include a number of novel alleles: 8 IGHV alleles, 2 IGKV alleles and 5 IGLV alleles. Despite their smaller sizes, erroneous calls were eliminated and excellent coverage was achieved when a set of repertoires comprising over 4 million V(D)J rearrangements from 99 individuals was analyzed using the Sets. The version-tracked AIRR-C IG Reference Sets are freely available at the OGRDB website (https://ogrdb.airr-community.org/germline_sets/Human) and will be regularly updated to include newly observed and previously reported sequences that can be confirmed by new high-quality data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/72ea/10884231/d9c17e017c46/fimmu-14-1330153-g001.jpg
