Department of Pediatrics, Philipps-University Marburg, Marburg, Germany.
Front Immunol. 2012 Jun 28;3:176. doi: 10.3389/fimmu.2012.00176. eCollection 2012.
Sequence analysis of immunoglobulin (Ig) heavy and light chain transcripts can refine the categorization of B cell subpopulations and can shed light on the selective forces that act during immune responses or immune dysregulation, such as autoimmunity, allergy, and B cell malignancy. High-throughput sequencing yields Ig transcript collections of unprecedented size. The authoritative web-based IMGT/HighV-QUEST program can analyze large collections of transcripts and provides annotated output files that describe many key properties of Ig transcripts. However, these flat files require additional processing to create figures, to analyze additional features, and to compare sequence sets. We present an easy-to-use Microsoft® Excel®-based software tool, named Immunoglobulin Analysis Tool (IgAT), for the summary, interrogation, and further processing of IMGT/HighV-QUEST output files. IgAT generates descriptive statistics and high-quality figures for collections of murine or human Ig heavy or light chain transcripts ranging from 1 to 150,000 sequences. In addition to traditionally studied properties of Ig transcripts (such as the usage of germline gene segments, or the length and composition of the CDR-3 region), IgAT also uses published algorithms to calculate the probability of antigen selection based on somatic mutational patterns, the average hydrophobicity of the antigen-binding sites, and predictable structural properties of the CDR-H3 loop according to Shirai's H3-rules. These refined analyses provide in-depth information about the selective forces acting upon Ig repertoires and allow the statistical and graphical comparison of two or more sequence sets. IgAT is easy to use on any computer running Excel® 2003 or later.
Thus, IgAT is a useful tool for gaining insight into the selective forces and functional properties of small to extremely large collections of Ig transcripts, thereby helping researchers mine a data set to its fullest.
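Two of the descriptive statistics named above, the CDR-3 length distribution and the average hydrophobicity of the antigen-binding site, can be illustrated with a short sketch. This is not IgAT's implementation (IgAT is an Excel workbook, and the abstract does not state which hydrophobicity scale it uses). The sketch assumes the Kyte-Doolittle hydropathy scale, and the function names `mean_hydrophobicity` and `summarize_cdr3` are hypothetical, chosen only for illustration. Input is assumed to be a list of CDR-3 amino acid strings already extracted from annotated output.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumption: the abstract does not
# say which scale IgAT applies; this one is used for illustration only).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(cdr3_aa):
    """Average hydropathy over the standard residues of one CDR-3.

    Characters outside the 20 standard amino acids (e.g. 'X' or '*')
    are skipped. Returns None if no scorable residue remains.
    """
    residues = [aa for aa in cdr3_aa.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        return None
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def summarize_cdr3(sequences):
    """Length distribution and mean hydrophobicity for a sequence set."""
    length_counts = Counter(len(seq) for seq in sequences)
    scores = [mean_hydrophobicity(seq) for seq in sequences]
    scores = [s for s in scores if s is not None]
    overall = sum(scores) / len(scores) if scores else None
    return {
        "n_sequences": len(sequences),
        "length_counts": dict(length_counts),
        "mean_hydrophobicity": overall,
    }


if __name__ == "__main__":
    # Hypothetical CDR-3 sequences, made up for this example.
    example_set = ["ARDYGSSYFDY", "ARGGYYFDY", "AKDLGY"]
    print(summarize_cdr3(example_set))
```

Running `summarize_cdr3` on two or more sequence sets gives per-set summaries that can then be compared, which is the kind of set-to-set comparison the abstract describes.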