Genome Sciences Centre, Vancouver, BC V5Z 4S6, Canada.
Bioinformatics. 2009 Nov 1;25(21):2872-7. doi: 10.1093/bioinformatics/btp367. Epub 2009 Jun 15.
Whole transcriptome shotgun sequencing data from non-normalized samples offer unique opportunities to study the metabolic states of organisms. One can deduce gene expression levels using sequence coverage as a surrogate, identify coding changes or discover novel isoforms or transcripts. Especially for discovery of novel events, de novo assembly of transcriptomes is desirable.
The transcriptome from tumor tissue of a patient with follicular lymphoma was sequenced with 36 base pair (bp) single- and paired-end reads on the Illumina Genome Analyzer II platform. We assembled approximately 194 million reads using ABySS into 66 921 contigs of 100 bp or longer, with a maximum contig length of 10 951 bp, representing over 30 million base pairs of unique transcriptome sequence, or roughly 1% of the genome.
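The "roughly 1% of the genome" figure can be checked with simple arithmetic. The sketch below is illustrative only: the read count, read length and assembled size come from the text above, while the human genome size of about 3.1 billion bp is an assumption not stated in the abstract, and the function names are hypothetical.

```python
# Figures quoted in the abstract.
READS = 194_000_000          # approximately 194 million reads
READ_LENGTH_BP = 36          # 36 bp single- and paired-end reads
ASSEMBLED_BP = 30_000_000    # "over 30 million" bp, used here as a lower bound

# Assumption (not from the abstract): approximate human genome size.
HUMAN_GENOME_BP = 3_100_000_000


def total_sequenced_bp(reads: int, read_length_bp: int) -> int:
    """Total number of bases contained in the input reads."""
    return reads * read_length_bp


def genome_fraction(assembled_bp: int, genome_bp: int) -> float:
    """Assembled sequence expressed as a fraction of the genome."""
    return assembled_bp / genome_bp


if __name__ == "__main__":
    raw_bp = total_sequenced_bp(READS, READ_LENGTH_BP)
    fraction = genome_fraction(ASSEMBLED_BP, HUMAN_GENOME_BP)
    print(f"Raw sequence in reads: {raw_bp / 1e9:.2f} Gbp")
    print(f"Assembled fraction of genome: {fraction:.2%}")
```

Under the assumed genome size, the assembled sequence comes to just under 1% of the genome, consistent with the rounded figure in the text.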
Source code and binaries of ABySS are freely available for download at http://www.bcgsc.ca/platform/bioinfo/software/abyss. The assembler is implemented in C++, and its parallel version uses Open MPI. The ABySS-Explorer tool is implemented in Java using the Java Universal Network/Graph (JUNG) framework.