SA-SSR: a suffix array-based algorithm for exhaustive and efficient SSR discovery in large genetic sequences.
Author information
Pickett B D, Karlinsey S M, Penrod C E, Cormier M J, Ebbert M T W, Shiozawa D K, Whipple C J, Ridge P G
Affiliations
Department of Biology, Brigham Young University, Provo, UT 84602, USA.
Publication information
Bioinformatics. 2016 Sep 1;32(17):2707-9. doi: 10.1093/bioinformatics/btw298. Epub 2016 May 11.
ABSTRACT
Simple Sequence Repeats (SSRs) are used to address research questions in a variety of fields (e.g. population genetics, phylogenetics and forensics), due to their high mutability within and between species. Here, we present an innovative algorithm, SA-SSR, based on suffix and longest common prefix arrays for efficiently detecting SSRs in large sets of sequences. Existing SSR detection applications are hampered by one or more limitations (e.g. speed, accuracy or ease-of-use). Our algorithm addresses these challenges while being the most comprehensive and correct SSR detection software available. SA-SSR is 100% accurate and detected >1000 more SSRs than the second best algorithm, while offering greater control to the user than any existing software.
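To illustrate how suffix and longest common prefix (LCP) arrays can expose SSRs, here is a minimal Python sketch. It is not the SA-SSR implementation: the function names (`suffix_array`, `lcp_array`, `find_ssrs`) and the parameters (`min_unit`, `max_unit`, `min_copies`) are hypothetical, the suffix array is built by naive sorting, and the pair scan is quadratic in the worst case. The idea it relies on is that two suffixes starting `p` positions apart and sharing a prefix of at least `p` characters mark a tandem repeat with unit length `p`.

```python
def suffix_array(s):
    # Naive construction: sort suffix start positions by the suffix text.
    # Adequate for short demo strings, not for genome-scale input.
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    # Kasai's algorithm: lcp[r] is the length of the common prefix of the
    # suffixes at sa[r - 1] and sa[r]; lcp[0] is 0.
    n = len(s)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]
        while i + h < n and j + h < n and s[i + h] == s[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1
    return lcp


def find_ssrs(s, min_unit=1, max_unit=6, min_copies=2):
    # Returns sorted (start, unit, complete_copies) tuples (hypothetical format).
    sa = suffix_array(s)
    lcp = lcp_array(s, sa)
    n = len(s)
    found = set()
    for r in range(n):
        shared = n  # running LCP between the suffixes at ranks r and k
        for k in range(r + 1, n):
            shared = min(shared, lcp[k])
            if shared < min_unit:
                break  # no later rank can share a long enough prefix
            i, j = sorted((sa[r], sa[k]))
            p = j - i  # candidate unit length
            if p < min_unit or p > max_unit or shared < p:
                continue
            if i > 0 and s[i - 1] == s[j - 1]:
                continue  # repeat extends to the left; report it only once
            unit = s[i:j]
            if unit in (unit + unit)[1:-1]:
                continue  # unit is itself a repeat of a shorter unit
            copies = (p + shared) // p
            if copies >= min_copies:
                found.add((i, unit, copies))
    return sorted(found)
```

For example, `find_ssrs("GATTACACACACAGGT", min_copies=3)` reports the `AC` repeat starting at index 4 with four complete copies. A production tool would replace the naive sort with a linear-time or near-linear-time suffix array construction.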
AVAILABILITY AND IMPLEMENTATION
SA-SSR is freely available at http://github.com/ridgelab/SA-SSR
CONTACT
perry.ridge@byu.edu
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.