Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland.
BMC Bioinformatics. 2011 Apr 26;12:119. doi: 10.1186/1471-2105-12-119.
With next-generation sequencing technologies, experiments that were considered prohibitive only a few years ago are now possible. However, while these technologies have the ability to produce enormous volumes of data, the sequence reads are prone to error. This poses fundamental hurdles when genetic diversity is investigated.
We developed ShoRAH, a computational method for quantifying genetic diversity in a mixed sample and for identifying the individual clones in the population, while accounting for sequencing errors. To assess its reliability, the software was run on simulated data and on real data obtained in wet-lab experiments.
ShoRAH is implemented in C++, Python, and Perl and has been tested under Linux and Mac OS X. Source code is available under the GNU General Public License at http://www.cbg.ethz.ch/software/shorah.