Splign：用于计算剪接比对并识别旁系同源物的算法。

Splign: algorithms for computing spliced alignments with identification of paralogs.

作者信息

Kapustin Yuri, Souvorov Alexander, Tatusova Tatiana, Lipman David

机构信息

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20814, USA.

出版信息

Biol Direct. 2008 May 21;3:20. doi: 10.1186/1745-6150-3-20.

DOI:10.1186/1745-6150-3-20

PMID:18495041

原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC2440734/

Abstract

BACKGROUND

The computation of accurate alignments of cDNA sequences against a genome is at the foundation of modern genome annotation pipelines. Several factors such as presence of paralogs, small exons, non-consensus splice signals, sequencing errors and polymorphic sites pose recognized difficulties to existing spliced alignment algorithms.

RESULTS

We describe a set of algorithms behind a tool called Splign for computing cDNA-to-Genome alignments. The algorithms include a high-performance preliminary alignment, a compartment identification based on a formally defined model of adjacent duplicated regions, and a refined sequence alignment. In a series of tests, Splign has produced more accurate results than other tools commonly used to compute spliced alignments, in a reasonable amount of time.

CONCLUSION

Splign's ability to deal with various issues complicating the spliced alignment problem makes it a helpful tool in eukaryotic genome annotation processes and alternative splicing studies. Its performance is enough to align the largest currently available pools of cDNA data such as the human EST set on a moderate-sized computing cluster in a matter of hours. The duplications identification (compartmentization) algorithm can be used independently in other areas such as the study of pseudogenes.

摘要

背景

将cDNA序列与基因组进行精确比对的计算是现代基因组注释流程的基础。诸如旁系同源物的存在、小外显子、非一致性剪接信号、测序错误和多态性位点等多种因素给现有的剪接比对算法带来了公认的困难。

结果

我们描述了一种名为Splign的工具背后的一组算法，用于计算cDNA到基因组的比对。这些算法包括高性能的初步比对、基于正式定义的相邻重复区域模型的区间识别以及精细的序列比对。在一系列测试中，Splign在合理的时间内产生了比其他常用于计算剪接比对的工具更准确的结果。

结论

Splign处理使剪接比对问题复杂化的各种问题的能力使其成为真核生物基因组注释过程和可变剪接研究中的有用工具。它的性能足以在几小时内在中等规模的计算集群上对目前可用的最大cDNA数据集（如人类EST集）进行比对。重复序列识别（区间划分）算法可独立用于其他领域，如假基因研究。

Suppr 超能文献

文献检索

文件翻译

深度研究

Suppr 超能文献

文献检索

文件翻译

深度研究

Splign：用于计算剪接比对并识别旁系同源物的算法。

Splign: algorithms for computing spliced alignments with identification of paralogs.

作者信息

机构信息

出版信息

BACKGROUND

RESULTS

CONCLUSION

背景

结果

结论

相似文献

引用本文的文献

本文引用的文献

Splign：用于计算剪接比对并识别旁系同源物的算法。

Splign: algorithms for computing spliced alignments with identification of paralogs.

作者信息

机构信息

出版信息

BACKGROUND

RESULTS

CONCLUSION

背景

结果

结论

相似文献

引用本文的文献

本文引用的文献