Department of Biostatistics and Computational Biology and Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts 02115, USA.
Clin Cancer Res. 2012 Nov 15;18(22):6136-46. doi: 10.1158/1078-0432.CCR-12-1915. Epub 2012 Nov 7.
More than 20 million archival tissue samples are stored annually in the United States as formalin-fixed, paraffin-embedded (FFPE) blocks, but RNA degradation during fixation and storage has prevented their use for transcriptional profiling. New and highly sensitive assays for whole-transcriptome microarray analysis of FFPE tissues are now available, but resulting data include noise and variability for which previous expression array methods are inadequate.
We present the two largest whole-genome expression studies from FFPE tissues to date, comprising 1,003 colorectal cancer (CRC) and 168 breast cancer samples, combined with a meta-analysis of 14 new and published FFPE microarray datasets. We develop and validate quality control (QC) methods through technical replication, independent samples, comparison to results from fresh-frozen tissue, and recovery of expected associations between gene expression and protein abundance.
Archival tissues from large, multicenter studies showed a much wider range of transcriptional data quality than smaller studies or studies of frozen tissue, and required stringent QC before further analysis. We developed novel QC methods for archival tissue expression profiles based on each sample's dynamic range and on the per-study median profile. These methods enabled the identification and validation of gene signatures of microsatellite instability and additional features of CRC, and improved recovery of associations between gene expression and protein abundance of MLH1, FASN, CDX2, MGMT, and SIRT1 in CRC tumors.
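The abstract names two QC criteria, sample dynamic range and the per-study median profile, without giving formulas. The following is a minimal sketch of how such a filter might look, not the authors' implementation. It assumes that dynamic range is approximated by each sample's interquartile range (IQR) and that agreement with the median profile is measured by Pearson correlation; the function names `qc_metrics` and `flag_samples` and both thresholds are hypothetical.

```python
import numpy as np


def qc_metrics(expr):
    """Compute two per-sample QC metrics for ONE study.

    expr: genes x samples array of log-scale expression values.
    Returns (iqr, corr), each of length n_samples.
    Assumption: IQR stands in for "dynamic range"; Pearson correlation
    stands in for agreement with the per-study median profile.
    """
    expr = np.asarray(expr, dtype=float)

    # Dynamic range proxy: interquartile range of each sample's values.
    q75, q25 = np.percentile(expr, [75, 25], axis=0)
    iqr = q75 - q25

    # Per-study median profile: median of each gene across all samples.
    median_profile = np.median(expr, axis=1)

    # Pearson correlation of every sample with the median profile.
    centered = expr - expr.mean(axis=0)
    mp = median_profile - median_profile.mean()
    num = (centered * mp[:, None]).sum(axis=0)
    denom = np.sqrt((centered ** 2).sum(axis=0) * (mp ** 2).sum())
    # Constant samples have zero variance; report correlation 0 for them.
    corr = np.divide(num, denom, out=np.zeros_like(num), where=denom > 0)

    return iqr, corr


def flag_samples(expr, min_iqr=0.5, min_corr=0.5):
    """Return a boolean mask of samples that pass both QC criteria.

    min_iqr and min_corr are illustrative thresholds, not published values.
    """
    iqr, corr = qc_metrics(expr)
    return (iqr >= min_iqr) & (corr >= min_corr)
```

In practice the thresholds would be chosen per study, for example from the distribution of the two metrics across technical replicates, rather than fixed in advance.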
These methods for large-scale QC of FFPE expression profiles enable study of the cancer transcriptome in relation to extensive clinicopathological information, tumor molecular biomarkers, and long-term lifestyle and outcome data.