Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA.
Genome Res. 2010 Dec;20(12):1730-9. doi: 10.1101/gr.108217.110. Epub 2010 Nov 2.
We present a powerful application of ultra-high-throughput sequencing, SAGE-Seq, for the accurate quantification of normal and neoplastic mammary epithelial cell transcriptomes. We develop data analysis pipelines that allow the mapping of sense and antisense strands of mitochondrial and RefSeq genes, the normalization between libraries, and the identification of differentially expressed genes. We find that the diversity of cancer transcriptomes is significantly higher than that of normal cells. Our analysis indicates that transcript discovery plateaus at 10 million reads/sample, and suggests a minimum desired sequencing depth of around five million reads. Comparison of SAGE-Seq and traditional SAGE on normal and cancerous breast tissues reveals that SAGE-Seq is more sensitive in detecting less-abundant genes, including those encoding known breast cancer-related transcription factors and G protein-coupled receptors (GPCRs). SAGE-Seq is able to identify genes and pathways abnormally activated in breast cancer that traditional SAGE failed to call. SAGE-Seq is a powerful method for the identification of biomarkers and therapeutic targets in human disease.
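The statement that transcript discovery plateaus with sequencing depth can be illustrated with a simple saturation (rarefaction) analysis: reads are subsampled at increasing depths and the number of distinct tags detected is counted at each depth. The sketch below is only an illustration under stated assumptions and is not the authors' pipeline. The function name `rarefaction_curve`, the `min_count` detection threshold, and the toy tag counts are all hypothetical.

```python
import random
from collections import Counter


def rarefaction_curve(tag_counts, depths, min_count=1, seed=0):
    """Count distinct tags detected at each subsampling depth.

    tag_counts: dict mapping tag (or gene) -> observed read count.
    depths:     iterable of read depths to evaluate.
    min_count:  reads required for a tag to count as detected (assumed threshold).
    Returns a list of (depth, n_detected) pairs, in increasing depth order.
    """
    rng = random.Random(seed)
    # Expand the count table into one entry per read, then shuffle once.
    # Taking prefixes of the shuffled list gives nested subsamples drawn
    # without replacement, so the curve can never decrease.
    reads = [tag for tag, n in tag_counts.items() for _ in range(n)]
    rng.shuffle(reads)

    curve = []
    for depth in sorted(depths):
        if depth > len(reads):
            break  # cannot subsample more reads than the library contains
        sub = Counter(reads[:depth])
        detected = sum(1 for c in sub.values() if c >= min_count)
        curve.append((depth, detected))
    return curve


# Toy library (made-up counts): a few abundant tags and several rare ones.
toy_library = {"tagA": 500, "tagB": 300, "tagC": 150, "tagD": 40, "tagE": 9, "tagF": 1}
for depth, detected in rarefaction_curve(toy_library, [10, 100, 500, 1000]):
    print(depth, detected)
```

Expanding the table into one list entry per read keeps the sketch short but is memory-hungry for libraries of millions of reads; a sampler that works directly on the count table would be preferable at that scale. A plateau is suggested when further increases in depth add few or no newly detected tags.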