HTJoinSolver: Human immunoglobulin VDJ partitioning using approximate dynamic programming constrained by conserved motifs.

Author information

Russ Daniel E, Ho Kwan-Yuet, Longo Nancy S

Affiliations

Division of Computational Bioscience, Center for Information Technology, NIH, 12 South Drive, Bethesda, MD, 20892, USA.

Vaccine Research Center, National Institute of Allergy and Infectious Diseases, NIH, 40 Convent Drive, Bethesda, MD, 20892, USA.

Publication information

BMC Bioinformatics. 2015 May 23;16(1):170. doi: 10.1186/s12859-015-0589-x.

Abstract

BACKGROUND

Partitioning the human immunoglobulin variable region into variable (V), diversity (D), and joining (J) segments is a common sequence analysis step. We introduce a novel approximate dynamic programming method that uses conserved immunoglobulin gene motifs to improve performance of aligning V-segments of rearranged immunoglobulin (Ig) genes. Our new algorithm enhances the former JOINSOLVER algorithm by processing sequences with insertions and/or deletions (indels) and improves the efficiency for large datasets provided by high throughput sequencing.
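To make the idea of a motif-constrained approximate alignment concrete, here is a minimal Python sketch. It is not the HTJoinSolver implementation: the function names, the scoring values, the band width, and the use of a single `TGT` motif as the anchor are all assumptions chosen for illustration. The sketch finds a shared motif in the query and a germline segment, takes the diagonal that motif implies, and fills only the dynamic programming cells within a few positions of that diagonal. The band leaves room for a small indel, and the restriction is what makes the alignment approximate and cheaper than filling the full matrix.

```python
NEG_INF = float("-inf")


def motif_diagonal(query, germline, motif):
    """Return the diagonal (germline index minus query index) implied by the
    first occurrence of `motif` in each sequence, or None if either lacks it.
    The motif is a hypothetical anchor, not one taken from the paper."""
    q_pos = query.find(motif)
    g_pos = germline.find(motif)
    if q_pos < 0 or g_pos < 0:
        return None
    return g_pos - q_pos


def banded_score(query, germline, diagonal, band=2,
                 match=1, mismatch=-1, gap=-2):
    """Score the best alignment of the whole query against part of the
    germline, visiting only cells with |(j - i) - diagonal| <= band.
    Scoring values are illustrative assumptions."""
    n, m = len(query), len(germline)
    dp = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        lo = max(0, i + diagonal - band)
        hi = min(m, i + diagonal + band)
        for j in range(lo, hi + 1):
            if i == 0:
                dp[0][j] = 0  # unaligned leading germline bases cost nothing
                continue
            # Query base aligned to a gap (an insertion in the query).
            best = dp[i - 1][j] + gap
            if j > 0:
                sub = match if query[i - 1] == germline[j - 1] else mismatch
                best = max(best,
                           dp[i - 1][j - 1] + sub,   # match or mismatch
                           dp[i][j - 1] + gap)       # deletion in the query
            dp[i][j] = best
    # Trailing germline bases are also free: take the best cell in the last row.
    return max(dp[n])


# Toy sequences (made up for this example); the query carries one inserted base.
germline = "AAACCCTGTGGGTTT"
query = "CCCATGTGGG"
diagonal = motif_diagonal(query, germline, "TGT")
score = banded_score(query, germline, diagonal)
```

In this sketch a wider `band` tolerates longer indels at the cost of more cells to fill, and a query that lacks the motif yields `None`, which a caller would have to handle (for example by falling back to a full alignment).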

RESULTS

In our simulations, which include rearrangements with indels, the V-matching success rate improved from 61% for partial alignments of sequences with indels in the original algorithm to over 99% in the approximate algorithm. An improvement in the alignment of human VDJ rearrangements over the initial JOINSOLVER algorithm was also seen when compared to the Stanford.S22 human Ig dataset with an online VDJ partitioning software evaluation tool.

CONCLUSIONS

HTJoinSolver can rapidly identify V- and J-segments with indels to high accuracy for mutated sequences when the mutation probability is around 30% and 20% respectively. The D-segment is much harder to fit even at 20% mutation probability. For all segments, the probability of correctly matching V, D, and J increases with our alignment score.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/88ca/4492005/bb32063f25f1/12859_2015_589_Fig1_HTML.jpg
