IMGT®, the international ImMunoGeneTics information system®, Laboratoire d'ImmunoGénétique Moléculaire LIGM, Institut de Génétique Humaine IGH, UMR 9002, CNRS, Montpellier University, Montpellier, France.
BMC Immunol. 2017 Jun 26;18(1):35. doi: 10.1186/s12865-017-0218-8.
IMGT®, the international ImMunoGeneTics information system® (http://www.imgt.org), was created in 1989 in Montpellier, France (CNRS and Montpellier University) to manage the huge and complex diversity of the antigen receptors, and is at the origin of immunoinformatics, a science at the interface between immunogenetics and bioinformatics. Immunoglobulins (IG) or antibodies and T cell receptors (TR) are managed and described in the IMGT® databases and tools at the level of receptor, chain and domain. The analysis of the IG and TR variable (V) domain rearranged nucleotide sequences is performed by IMGT/V-QUEST (online since 1997, 50 sequences per batch) and, for next generation sequencing (NGS), by IMGT/HighV-QUEST, the high throughput version of IMGT/V-QUEST (portal begun in 2010, 500,000 sequences per batch). In vitro combinatorial libraries of engineered antibody single chain Fragment variable (scFv), which mimic the in vivo natural diversity of the adaptive immune responses, are extensively screened for the discovery of novel antigen binding specificities. However, the analysis of NGS full-length scFv (~850 bp) represents a challenge, as they contain two V domains connected by a linker and there is no tool for the analysis of two V domains in a single chain.
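The per-batch limits quoted above (50 sequences for IMGT/V-QUEST, 500,000 for IMGT/HighV-QUEST) mean that a larger set of reads has to be split before submission. The following is a minimal sketch of such a split. The function name `batches` and the two constants are illustrative assumptions, not part of any IMGT® interface, and the sketch does not perform the submission itself.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

# Batch limits as stated in the abstract (hypothetical constant names).
VQUEST_BATCH = 50
HIGHVQUEST_BATCH = 500_000


def batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Example: 120 placeholder reads split for the 50-sequence limit.
example_sizes = [len(chunk) for chunk in batches(range(120), VQUEST_BATCH)]
```

Because the generator consumes its input lazily, it could also be fed from a streaming FASTA reader without loading the full NGS run into memory.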
The functionality "Analysis of single chain Fragment variable (scFv)" has been implemented in IMGT/V-QUEST and, for NGS, in IMGT/HighV-QUEST for the analysis of the two V domains of IG and TR scFv. It proceeds in five steps: search for a first closest V-REGION, full characterization of the first V-(D)-J-REGION, then search for a second V-REGION and full characterization of the second V-(D)-J-REGION, and finally linker delimitation.
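The control flow of the five steps can be sketched as follows. This is only an illustration under stated assumptions: the real tool aligns against the IMGT reference directory, whereas the `make_toy_finder` helper here stands in for both the V-REGION search and the V-(D)-J-REGION characterization by looking for exact, made-up motifs. All names (`DomainHit`, `make_toy_finder`, `analyse_scfv`) and the example sequences are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class DomainHit:
    name: str   # label of the matched reference (hypothetical)
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


Finder = Callable[[str, Optional[DomainHit]], Optional[DomainHit]]


def make_toy_finder(references: Dict[str, str]) -> Finder:
    """Toy stand-in for alignment: exact motif search, skipping a masked span."""
    def find(seq: str, masked: Optional[DomainHit]) -> Optional[DomainHit]:
        for name, motif in references.items():
            pos = seq.find(motif)
            while pos != -1:
                hit = DomainHit(name, pos, pos + len(motif))
                if masked is None or hit.end <= masked.start or hit.start >= masked.end:
                    return hit
                pos = seq.find(motif, pos + 1)
        return None
    return find


def analyse_scfv(
    seq: str, find: Finder
) -> Optional[Tuple[DomainHit, Tuple[int, int], DomainHit]]:
    # Steps 1-2: locate (and, in the real tool, characterize) a first domain.
    first = find(seq, None)
    if first is None:
        return None
    # Steps 3-4: search again, excluding the span already assigned.
    second = find(seq, first)
    if second is None:
        return None
    # The first hit is not necessarily the 5' one, so order by position.
    five_prime, three_prime = sorted((first, second), key=lambda h: h.start)
    # Step 5: the linker is whatever lies between the two domains.
    linker = (five_prime.end, three_prime.start)
    return five_prime, linker, three_prime


# Made-up motifs and read, for illustration only.
toy_refs = {"VL": "GGGGTTTT", "VH": "AAAACCCC"}
toy_read = "AAAACCCC" + "GGTGGT" + "GGGGTTTT"
toy_result = analyse_scfv(toy_read, make_toy_finder(toy_refs))
```

In this sketch the first domain found is the 3' one (the "VL" motif is tried first), which is why the two hits are sorted by position before the linker is delimited.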
For each sequence or NGS read, positions of the 5'V-DOMAIN, linker and 3'V-DOMAIN in the scFv are provided in the 'V-orientated' sense. Each V-DOMAIN is fully characterized (gene identification, sequence description, junction analysis, characterization of mutations and amino acid changes). The functionality is generic and can analyse any IG or TR single chain nucleotide sequence containing two V domains, provided that the corresponding species IMGT reference directory is available.
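Reporting positions in the 'V-orientated' sense implies that a read submitted in the antisense orientation is reverse-complemented and its coordinates are expressed on that strand. A minimal sketch of the coordinate arithmetic is given below, assuming 0-based half-open spans. The function names are illustrative and are not taken from IMGT/V-QUEST.

```python
from typing import Tuple

# Complement table for unambiguous nucleotides (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def flip_span(start: int, end: int, length: int) -> Tuple[int, int]:
    """Map a 0-based half-open span onto the reverse-complemented strand."""
    if not 0 <= start <= end <= length:
        raise ValueError("span must lie within the sequence")
    return length - end, length - start


# Example: a span on a short made-up read, re-expressed on the other strand.
read = "AACCGT"
flipped = flip_span(0, 2, len(read))
```

The span that covered `read[0:2]` covers the reverse complement of those same bases on the flipped strand, so downstream domain and linker positions stay consistent whichever orientation the read arrived in.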
The "Analysis of single chain Fragment variable (scFv)" implemented in IMGT/V-QUEST and, for NGS, in IMGT/HighV-QUEST provides the identification and full characterization of the two V domains of full-length scFv (~850 bp) nucleotide sequences from combinatorial libraries. The analysis can also be performed on concatenated paired chains of expressed antigen receptor IG or TR repertoires.