Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02115, USA.
Database (Oxford). 2013 Apr 2;2013:bat013. doi: 10.1093/database/bat013. Print 2013.
This article introduces a manually curated data collection for gene expression meta-analysis of patients with ovarian cancer and software for reproducible preparation of similar databases. This resource provides uniformly prepared microarray data for 2970 patients from 23 studies with curated and documented clinical metadata. It allows users to efficiently identify studies and patient subgroups of interest for analysis and to perform meta-analysis immediately without the challenges posed by harmonizing heterogeneous microarray technologies, study designs, expression data processing methods and clinical data formats. We confirm that the recently proposed biomarker CXCL12 is associated with patient survival, independently of stage and optimal surgical debulking, which was possible only through meta-analysis owing to insufficient sample sizes of the individual studies. The database is implemented as the curatedOvarianData Bioconductor package for the R statistical computing language, providing a comprehensive and flexible resource for clinically oriented investigation of the ovarian cancer transcriptome. The package and pipeline for producing it are available from http://bcb.dfci.harvard.edu/ovariancancer.
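The abstract notes that the CXCL12 association could be confirmed only by pooling studies, because individual cohorts were too small. A minimal sketch of why pooling helps is given below, assuming a fixed-effect, inverse-variance meta-analysis of per-study log hazard ratios. This model is an assumption made for illustration and is not confirmed as the authors' method. The function names and all numeric values are hypothetical and are not taken from the article or from the curatedOvarianData package.

```python
import math


def pool_fixed_effect(log_hrs, std_errs):
    """Inverse-variance (fixed-effect) pooling of per-study log hazard ratios.

    Returns the pooled log hazard ratio and its standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


def two_sided_p(estimate, se):
    """Two-sided p-value from a normal (Wald) approximation."""
    z = estimate / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical per-study results (NOT real data): each study on its own
# shows a similar effect but is too small to reach p < 0.05.
log_hrs = [0.20, 0.15, 0.25, 0.18]
std_errs = [0.12, 0.15, 0.14, 0.13]

study_p_values = [two_sided_p(b, se) for b, se in zip(log_hrs, std_errs)]
pooled, pooled_se = pool_fixed_effect(log_hrs, std_errs)
pooled_p = two_sided_p(pooled, pooled_se)

print("per-study p-values:", [round(p, 3) for p in study_p_values])
print("pooled log HR:", round(pooled, 3), "SE:", round(pooled_se, 3))
print("pooled p-value:", round(pooled_p, 4))
```

With these made-up inputs, the pooled standard error is smaller than that of any single study, which is how a consistent effect that is non-significant in each study can become detectable in the combined analysis. A random-effects model would be a reasonable alternative when between-study heterogeneity is expected.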