a Department of Computer Science & Technology, Heilongjiang University, Harbin, China.
b Shandong Aerospace Institute of Electronic Technology, Yantai, China.
Bioengineered. 2017 Nov 2;8(6):750-758. doi: 10.1080/21655979.2017.1373538. Epub 2017 Sep 21.
Gene splicing is the process of assembling a large number of unordered short sequence fragments (reads) into the original genome sequence as accurately as possible. This article reviews several popular read-based splicing algorithms, including reference-genome algorithms and de novo splicing algorithms (greedy extension, Overlap-Layout-Consensus graph, and De Bruijn graph). We also discuss a new splicing method based on the MapReduce strategy and Hadoop. By comparing these algorithms, we draw some conclusions and offer suggestions for gene splicing research.
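To make the greedy-extension idea concrete, the following is a minimal Python sketch, not the implementation of any tool reviewed in the article. It assumes error-free reads on a single strand, and the names `overlap`, `greedy_assemble`, and the `min_len` threshold are hypothetical choices made for illustration. At each step it merges the pair of reads with the longest suffix-prefix overlap and stops when no overlap of at least `min_len` bases remains.

```python
def overlap(a, b, min_len=1):
    """Return the length of the longest suffix of `a` that equals a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Greedily merge reads by largest overlap (illustrative sketch only).

    Assumes error-free, same-strand reads; real assemblers must also handle
    sequencing errors, reverse complements, and repeats.
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        # Exhaustive pairwise search: quadratic, fine for a toy example.
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining pair overlaps enough; return separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


if __name__ == "__main__":
    toy_reads = ["ATGGCGT", "GCGTGCA", "TGCAATC"]
    print(greedy_assemble(toy_reads))
```

The pairwise search is quadratic in the number of reads, which is one reason graph-based formulations (Overlap-Layout-Consensus and De Bruijn graphs) and distributed approaches such as MapReduce are of interest for large read sets.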