Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA.
BMC Bioinformatics. 2011 Feb 3;12:46. doi: 10.1186/1471-2105-12-46.
DNA microarrays have become a nearly ubiquitous tool for the study of human disease, and nowhere is this more true than in cancer. With hundreds of studies and thousands of expression profiles, representing the majority of human cancers, completed and deposited in public databases, the challenge has been to access and use this wealth of data effectively.
To address this issue we have collected published human cancer gene expression datasets generated on the Affymetrix GeneChip platform, and carefully annotated those studies with a focus on providing accurate sample annotation. To facilitate comparison between datasets, we implemented a consistent data normalization and transformation protocol and then applied stringent quality control procedures to flag low-quality assays.
The resulting resource, the GeneChip Oncology Database, is available through a publicly accessible website that provides several query options and analytical tools through an intuitive interface.