Pommié Christelle, Levadoux Séverine, Sabatier Robert, Lefranc Gérard, Lefranc Marie-Paule
IMGT, Laboratoire d'ImmunoGénétique Moléculaire LIGM, Université Montpellier II, UPR CNRS 1142 Institut de Génétique Humaine IGH, 141 rue de la Cardonille, F-34396 Montpellier Cedex 5, France.
J Mol Recognit. 2004 Jan-Feb;17(1):17-32. doi: 10.1002/jmr.647.
IMGT, the international ImMunoGeneTics information system® (http://imgt.cines.fr), is a high-quality integrated information system specializing in immunoglobulins (IG), T cell receptors (TR) and major histocompatibility complex (MHC) of humans and other vertebrates. IMGT comprises IMGT/LIGM-DB, the comprehensive database of IG and TR sequences from humans and other vertebrates (76 846 sequences in September 2003). To define the IMGT criteria necessary for standardized statistical analyses, the sequences of the IG variable regions (V-REGIONs) from productively rearranged human IG heavy (IGH), IG light kappa (IGK) and IG light lambda (IGL) chains were extracted from IMGT/LIGM-DB. The framework amino acid positions of 2474 V-REGIONs (1360 IGHV, 585 IGKV, 529 IGLV) were numbered according to the IMGT unique numbering. Two statistical methods (correspondence analysis and hierarchical classification) were used to analyze the 237 framework positions (80 for IGHV, 79 for IGKV, 78 for IGLV) for three properties (hydropathy, volume and chemical characteristics) of the 20 common amino acids. Results of the analyses are shown as standardized two-dimensional representations, designated as IMGT Colliers de Perles statistical profiles. They characterize the amino acid properties at each framework position of the expressed IG V-REGIONs and visualize the resemblances and differences between heavy and light chains, and between kappa and lambda sequences. The standardized criteria defined in this paper (amino acid positions and property classes) will be useful for studying mutations and allele polymorphisms, for establishing correlations between amino acids in the IG and TR protein three-dimensional structures, and for extracting new knowledge from V-like domains of chains, other than IG and TR, that belong to the immunoglobulin superfamily.
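As a rough illustration of the kind of input such an analysis needs, the sketch below tabulates, for each aligned position, how many sequences carry a residue of each hydropathy class. A position-by-class table of this shape is what a correspondence analysis would typically start from. Everything here is an assumption made for illustration and not taken from the abstract: the three-class grouping in `HYDROPATHY_CLASS` should be checked against the paper, the function name `position_profiles` is hypothetical, and the toy sequences are invented and are not IMGT/LIGM-DB data.

```python
from collections import Counter

# Assumed three-class hydropathy grouping (illustrative; verify against the paper).
HYDROPATHY_CLASS = {}
for residues, label in (
    ("ACFILMVW", "hydrophobic"),
    ("GHPSTY", "neutral"),
    ("DEKNQR", "hydrophilic"),
):
    for aa in residues:
        HYDROPATHY_CLASS[aa] = label

CLASSES = ("hydrophobic", "neutral", "hydrophilic")


def position_profiles(aligned_seqs):
    """Return one row per alignment column: counts of each hydropathy class.

    Gap characters ('.' or '-') and any unrecognized letters are skipped.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    table = []
    for col in range(length):
        counts = Counter(
            HYDROPATHY_CLASS[s[col]]
            for s in aligned_seqs
            if s[col] in HYDROPATHY_CLASS
        )
        # Counter returns 0 for classes that never occur in this column.
        table.append([counts[c] for c in CLASSES])
    return table


# Toy aligned fragments, invented for illustration (not IMGT/LIGM-DB data).
toy = ["QVQL", "EVQL", "QIK.", "DIQM"]
profiles = position_profiles(toy)
for position, row in enumerate(profiles, start=1):
    print(position, dict(zip(CLASSES, row)))
```

The same tabulation could be repeated with a volume or chemical-characteristics mapping in place of `HYDROPATHY_CLASS`; real use would index columns by IMGT unique numbering rather than by simple column order.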