Cai Haoyang, Gupta Saumya, Rath Prisni, Ai Ni, Baudis Michael
Institute of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland; Swiss Institute of Bioinformatics, 8057 Zurich, Switzerland; Center of Growth, Metabolism, and Aging, Key Laboratory of Bio-Resources and Eco-Environment, College of Life Sciences, Sichuan University, Chengdu 610064, Sichuan, China.
Institute of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland; Swiss Institute of Bioinformatics, 8057 Zurich, Switzerland.
Nucleic Acids Res. 2015 Jan;43(Database issue):D825-30. doi: 10.1093/nar/gku1123. Epub 2014 Nov 26.
Somatic copy number aberrations (CNA) represent a mutation type encountered in the majority of cancer genomes. Here, we present the 2014 edition of arrayMap (http://www.arraymap.org), a publicly accessible collection of pre-processed oncogenomic array data sets and CNA profiles, representing a vast range of human malignancies. Since the initial release, we have enhanced this resource both in content and, especially, in its support for data mining. The 2014 release of arrayMap contains more than 64,000 genomic array data sets, representing about 250 tumor diagnoses. Data sets included in arrayMap have been assembled from public repositories as well as additional resources, and integrated by applying custom processing pipelines. Online tools have been upgraded for more flexible array data visualization, including options for processing user-provided, non-public data sets. Data integration has been improved by mapping to multiple editions of the human reference genome, with the majority of the data now available for both the UCSC hg18 and GRCh37 versions. The large amount of tumor CNA data in arrayMap can be freely downloaded by users to promote data mining projects and to explore special events such as chromothripsis-like genome patterns.