Clustering-based identification of clonally-related immunoglobulin gene sequence sets.

Authors

Chen Zhiliang, Collins Andrew M, Wang Yan, Gaëta Bruno A

Affiliation

School of Computer Science and Engineering, University of New South Wales, NSW 2052, Australia.

Publication information

Immunome Res. 2010 Sep 27;6 Suppl 1(Suppl 1):S4. doi: 10.1186/1745-7580-6-S1-S4.

Abstract

BACKGROUND

Clonal expansion of B lymphocytes, coupled with somatic mutation and antigen selection, allows the mammalian humoral immune system to generate highly specific immunoglobulins (IG), or antibodies, against invading bacteria, viruses and toxins. The availability of high-throughput DNA sequencing methods is providing new avenues for studying this clonal expansion and identifying the factors guiding the generation of antibodies. Identifying groups of rearranged immunoglobulin gene sequences descended from the same rearrangement (clonally-related sets) in very large sets of sequences is facilitated by immunoglobulin gene sequence alignment and partitioning software that can accurately predict the component germline genes, but it has so far required painstaking visual inspection and analysis of sequences.

RESULTS

We have developed and implemented an algorithm for identifying sets of clonally-related sequences in large human immunoglobulin heavy chain gene variable region sequence sets. The program processes sequences that have been partitioned using iHMMune-align, and uses pairwise comparisons of CDR3 sequences and similarity in IGHV and IGHJ germline gene assignments to construct a distance matrix. Agglomerative hierarchical clustering is then used to identify likely groups of clonally-related sequences. The program is available for download from http://www.cse.unsw.edu.au/~ihmmune/ClonalRelate/ClonalRelate.zip.
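The pipeline described above (pairwise distances built from CDR3 similarity and germline gene agreement, followed by agglomerative hierarchical clustering) can be illustrated with a minimal sketch. This is not the ClonalRelate implementation: the distance formula, the `gene_penalty` weight, the average-linkage rule, the `cutoff` value, the record fields and the example sequences are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def pair_distance(s, t, gene_penalty=0.5):
    """Hypothetical distance: length-normalised CDR3 edit distance plus a
    fixed penalty for each mismatched IGHV or IGHJ assignment."""
    longest = max(len(s["cdr3"]), len(t["cdr3"]), 1)
    cdr3 = edit_distance(s["cdr3"], t["cdr3"]) / longest
    v = gene_penalty * (s["ighv"] != t["ighv"])
    j = gene_penalty * (s["ighj"] != t["ighj"])
    return cdr3 + v + j


def cluster(seqs, cutoff=0.2):
    """Average-linkage agglomerative clustering, stopped once the closest
    pair of clusters is further apart than `cutoff` (an assumed threshold)."""
    n = len(seqs)
    dist = {frozenset((i, j)): pair_distance(seqs[i], seqs[j])
            for i, j in combinations(range(n), 2)}

    def linkage(a, b):
        return sum(dist[frozenset((i, j))] for i in a for j in b) / (len(a) * len(b))

    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[x], clusters[y]) > cutoff:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return [sorted(c) for c in clusters]


# Made-up records standing in for partitioned sequences.
seqs = [
    {"cdr3": "GCGAGAGATTACTACTTTGACTAC", "ighv": "IGHV3-23", "ighj": "IGHJ4"},
    {"cdr3": "GCGAGAGATTACTATTTTGACTAC", "ighv": "IGHV3-23", "ighj": "IGHJ4"},
    {"cdr3": "GCGAAAGGGGGTATAGCAGCTCGT", "ighv": "IGHV1-69", "ighj": "IGHJ6"},
]
groups = cluster(seqs)
```

With these inputs the first two records, which share gene assignments and differ by a single CDR3 substitution, fall into one group, while the third stays on its own. In practice the cutoff and gene weighting would need to be tuned against expert-curated clonal sets.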

CONCLUSIONS

The method was evaluated on several benchmark datasets and provided a more accurate and considerably faster identification of clonally-related immunoglobulin gene sequences than visual inspection by domain experts.
