BGI-Shenzhen, Shenzhen Biodynamic Optical Imaging Center, Peking University, Beijing, China.
Bioinformatics. 2012 Jun 1;28(11):1533-5. doi: 10.1093/bioinformatics/bts187. Epub 2012 Apr 15.
Next-generation high-throughput sequencing technologies, especially those from Illumina, have been widely used in re-sequencing and de novo assembly studies. However, no existing software can yet simulate Illumina reads with realistic error and quality distributions and coverage bias, a capability that would be very useful for developing related software and for designing sequencing projects.
We provide a software package, pIRS (profile-based Illumina pair-end reads simulator), which simulates Illumina reads using empirical Base-Calling and GC%-depth profiles trained from real re-sequencing data. The error and quality distributions, as well as the coverage bias patterns, of reads simulated by pIRS fit the properties of real sequencing data better than those produced by existing simulators. In addition, pIRS comes with a tool for simulating heterozygous diploid genomes.
pIRS is written in C++ and Perl, and is freely available at ftp://ftp.genomics.org.cn/pub/pIRS/.
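To illustrate how a GC%-depth profile can shape simulated coverage, here is a minimal Python sketch. It is not pIRS's actual implementation (pIRS is written in C++ and Perl); it assumes the profile is a simple list of relative depths, one per equal-width GC bin, and uses rejection sampling to choose read start positions. The function names `gc_fraction` and `sample_read_starts`, the `max_attempts` guard, and the toy profile values are all hypothetical.

```python
import random


def gc_fraction(seq):
    """Return the fraction of G/C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "GCgc" for base in seq) / len(seq)


def sample_read_starts(genome, read_len, n_reads, gc_depth_profile, rng,
                       max_attempts=None):
    """Rejection-sample read start positions under a GC%-depth profile.

    gc_depth_profile is an assumed format: relative depths in [0, 1],
    one per equal-width GC bin covering 0% to 100% GC.
    """
    n_bins = len(gc_depth_profile)
    max_start = len(genome) - read_len
    if max_start < 0:
        raise ValueError("read_len is longer than the genome")
    if max_attempts is None:
        max_attempts = n_reads * 1000  # arbitrary guard against endless loops

    starts = []
    for _ in range(max_attempts):
        if len(starts) == n_reads:
            break
        pos = rng.randint(0, max_start)  # candidate start, inclusive bounds
        gc = gc_fraction(genome[pos:pos + read_len])
        bin_idx = min(int(gc * n_bins), n_bins - 1)  # 100% GC goes in last bin
        # Accept the candidate with probability equal to its bin's depth.
        if rng.random() < gc_depth_profile[bin_idx]:
            starts.append(pos)
    if len(starts) < n_reads:
        raise RuntimeError("profile rejected too many candidate positions")
    return starts


# Toy usage: an AT-rich half followed by a GC-rich half, with a made-up
# profile that gives zero depth to windows with 50% GC or more.
toy_genome = "AT" * 500 + "GC" * 500
toy_profile = [1.0, 1.0, 0.0, 0.0]
toy_starts = sample_read_starts(toy_genome, 50, 100, toy_profile,
                                random.Random(0))
```

In a real setting the profile would be estimated from mapped re-sequencing data rather than written by hand, and a per-cycle base-calling profile would then be applied to assign quality scores and introduce errors into each sampled read.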