Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts, USA.
Nat Biotechnol. 2011 May 15;29(7):644-52. doi: 10.1038/nbt.1883.
Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.
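As a rough illustration of the de Bruijn graph idea mentioned in the abstract, the following Python sketch builds a graph from short reads and follows an unbranched path to recover a longer sequence. It is not Trinity's implementation, which the abstract does not describe in detail. The function names `build_de_bruijn` and `walk_unambiguous`, the toy reads, and the choice of k = 4 are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_de_bruijn(reads, k):
    """Count every k-mer in the reads and link overlapping (k-1)-mers.

    Returns (edges, counts): edges maps each (k-1)-mer prefix to the set of
    (k-1)-mer suffixes that follow it; counts holds k-mer occurrence counts.
    """
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    edges = defaultdict(set)
    for kmer in counts:
        edges[kmer[:-1]].add(kmer[1:])
    return edges, counts


def walk_unambiguous(edges, start):
    """Extend from `start` while the current node has exactly one successor.

    Stops at a branch, a dead end, or a node that was already visited.
    """
    path = start
    node = start
    seen = {start}
    while len(edges.get(node, ())) == 1:
        (nxt,) = edges[node]
        if nxt in seen:
            break
        path += nxt[-1]  # each step adds one new base
        seen.add(nxt)
        node = nxt
    return path


# Hypothetical toy reads that overlap to spell one short transcript.
reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
edges, counts = build_de_bruijn(reads, k=4)
contig = walk_unambiguous(edges, "ATG")
print(contig)
```

In this toy case the graph is a single unbranched chain, so the walk recovers the whole sequence. Real transcriptome data produces branches from sequencing errors, alternative splicing and paralogous genes, and resolving those branches is the difficult part that a simple walk like this one does not attempt.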