Lefrançois Philippe, Tetzlaff Michael T, Moreau Linda, Watters Andrew K, Netchiporouk Elena, Provost Nathalie, Gilbert Martin, Ni Xiao, Sasseville Denis, Duvic Madeleine, Litvinov Ivan V
Division of Dermatology, McGill University Health Centre, Montreal, QC, Canada.
Department of Pathology, Section of Dermatopathology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States.
Front Med (Lausanne). 2017 Sep 22;4:153. doi: 10.3389/fmed.2017.00153. eCollection 2017.
Cutaneous T-cell lymphomas (CTCLs) are a heterogeneous group of malignancies with courses ranging from indolent to potentially lethal. We recently studied gene expression profiles generated by TruSeq targeted RNA gene expression sequencing in a cohort of 157 patients. We observed that sequencing library quality and depth from formalin-fixed paraffin-embedded (FFPE) skin samples were significantly lower when biopsies were obtained prior to 2009. We also observed that the fresh CTCL samples clustered together, even though they included stage I-IV disease. In this study, we compared TruSeq gene expression patterns in older (≤2008) vs. more recent (≥2009) FFPE samples to determine whether these clustering analyses and the previously described differentially expressed gene (DEG) findings are robust when analyzed by year of biopsy. We also explored biases found in FFPE samples subjected to TruSeq gene expression analysis. Our results showed that the ≤2008 and ≥2009 samples each clustered as well as the full data set and, importantly, both analyses produced nearly identical trends and findings. Specifically, both analyses identified nearly identical DEGs when comparing benign samples vs. (1) stage I-IV and (2) stage IV (alone) CTCL samples. Results obtained using either ≤2008 or ≥2009 samples were strongly correlated. Furthermore, by using subgroup analyses, we were able to identify additional novel DEGs that did not reach statistical significance in the prior full data set analysis. These included both CTCL-upregulated and CTCL-downregulated genes. With respect to sample biases, fresh samples clustered tightly together regardless of whether we performed subgroup analyses or the full data set analysis. While principal component analysis revealed that fresh samples were spatially closer together, indicating some preprocessing batch effect, they remained close to other normal/benign and FFPE CTCL samples and did not cluster as outliers on their own. Notably, this did not affect the determination of DEGs when analyzing ≥2009 samples (fresh and FFPE biopsies) vs. ≥2009 FFPE samples alone.
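The subgroup comparison described in the abstract (checking whether results from ≤2008 and ≥2009 samples are correlated) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, and the function names, the baseline expression level, the noise levels, and the use of Pearson correlation on per-gene log2 fold changes are all assumptions made for illustration only.

```python
import math
import random


def log2_fold_change(case, control, pseudocount=1.0):
    """Log2 ratio of mean expression (case vs. control) for one gene."""
    mean_case = sum(case) / len(case)
    mean_control = sum(control) / len(control)
    return math.log2((mean_case + pseudocount) / (mean_control + pseudocount))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate_fold_changes(true_effects, n_samples, noise_sd, rng):
    """Toy subgroup: draw noisy counts per gene, return log2 fold changes."""
    baseline = 100.0  # hypothetical benign expression level (assumption)
    fold_changes = []
    for effect in true_effects:
        benign = [max(0.0, rng.gauss(baseline, noise_sd))
                  for _ in range(n_samples)]
        ctcl = [max(0.0, rng.gauss(baseline * 2 ** effect, noise_sd))
                for _ in range(n_samples)]
        fold_changes.append(log2_fold_change(ctcl, benign))
    return fold_changes


def subgroup_concordance(seed=0):
    """Correlate per-gene fold changes from two synthetic subgroups."""
    rng = random.Random(seed)
    n_genes = 50
    # Synthetic "true" log2 effects spread evenly from -3 to +3.
    true_effects = [-3 + 6 * i / (n_genes - 1) for i in range(n_genes)]
    # The older subgroup is given more noise to mimic lower library quality.
    older = simulate_fold_changes(true_effects, 20, 30.0, rng)
    recent = simulate_fold_changes(true_effects, 20, 10.0, rng)
    return pearson(older, recent)


if __name__ == "__main__":
    print(f"Pearson r between subgroups: {subgroup_concordance():.3f}")
```

In this toy setting, a correlation near 1 would indicate that both subgroups recover the same direction and magnitude of expression changes despite the noisier older samples. A real analysis would use normalized read counts and a dedicated differential expression method.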