Life Technologies, Foster City, California, United States of America.
PLoS Comput Biol. 2012;8(4):e1002464. doi: 10.1371/journal.pcbi.1002464. Epub 2012 Apr 5.
High-throughput RNA sequencing enables quantification of transcripts (both known and novel), exon/exon junctions, and fusions of exons from different genes. Discovery of gene fusions, particularly those expressed at low abundance, is a challenge with short- and medium-length sequencing reads. To address this challenge, we implemented an RNA-Seq mapping pipeline within the LifeScope software. We introduced new features including filter and junction mapping, annotation-aided pairing rescue, and accurate mapping quality values. We combined this pipeline with a Suffix Array Spliced Read (SASR) aligner to detect chimeric transcripts. Performing paired-end RNA-Seq of the breast cancer cell line MCF-7 using the SOLiD system, we called 40 gene fusions among over 120,000 splicing junctions. We validated 36 of these 40 fusions with TaqMan assays, of which 25 were expressed in MCF-7 but not in the Human Brain Reference. An intra-chromosomal gene fusion involving the estrogen receptor alpha gene ESR1 and another involving RPS6KB1 (ribosomal protein S6 kinase beta-1) were recurrently expressed in a number of breast tumor cell lines and a clinical tumor sample.
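As a rough illustration of what calling fusions among splice junctions involves, the sketch below keeps only those junctions whose two sides are annotated to different genes and that have enough supporting reads. It is not the LifeScope or SASR implementation: the `Junction` record, the `call_fusion_candidates` function, the placeholder gene names, and the `min_reads` threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    """A splice junction with the annotated gene on each side (hypothetical record)."""
    donor_gene: str
    acceptor_gene: str
    supporting_reads: int


def call_fusion_candidates(junctions, min_reads=2):
    """Return junctions that join two different genes with enough read support.

    min_reads is an assumed threshold, not a value taken from the paper.
    """
    return [
        j for j in junctions
        if j.donor_gene != j.acceptor_gene and j.supporting_reads >= min_reads
    ]


# Toy input with placeholder gene names.
junctions = [
    Junction("GENE_A", "GENE_A", 57),  # ordinary within-gene splice junction
    Junction("GENE_A", "GENE_B", 4),   # joins two genes: fusion candidate
    Junction("GENE_C", "GENE_D", 1),   # joins two genes, but too few reads
]

candidates = call_fusion_candidates(junctions)
for j in candidates:
    print(f"{j.donor_gene} -> {j.acceptor_gene} ({j.supporting_reads} reads)")
```

A real pipeline would also have to weigh mapping quality, read pairing, and annotation, which this sketch ignores.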