Genomics England, Charterhouse Square, London EC1M 6BQ, UK.
Department of Haematology, University of Cambridge, Cambridge CB2 0PT, UK.
Nucleic Acids Res. 2017 Jul 3;45(W1):W189-W194. doi: 10.1093/nar/gkx445.
High-profile genomic variation projects such as the 1000 Genomes Project and the Exome Aggregation Consortium are generating a wealth of knowledge about human genomic variation, which serves as an essential reference for identifying disease-causing genotypes. However, accessing these data, comparing the various studies and integrating the data into downstream analyses remain cumbersome. The Human Genome Variation Archive (HGVA) tackles these challenges and facilitates access to genomic data from key reference projects in a clean, fast and integrated fashion. HGVA provides an efficient and intuitive web interface for easy data mining, a comprehensive RESTful API, and client libraries in Python, Java and JavaScript for fast programmatic access to its knowledge base. HGVA calculates population frequencies for these projects and enriches their data with variant annotation provided by CellBase, a rich and fast annotation solution. HGVA serves as a proof of concept for the genome analysis developments being carried out by the University of Cambridge together with the UK's 100 000 Genomes Project and the National Institute for Health Research BioResource Rare-Diseases, in particular the deployment of the open-source Computational Biology (OpenCB) software platform for storing and analyzing massive genomic datasets.
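The abstract states that HGVA calculates population frequencies for the reference projects. As an illustration of what such a calculation involves, the following is a minimal sketch that computes the alternate allele frequency of a single biallelic variant per population from VCF-style genotype strings. It is not HGVA's or OpenCB's implementation; the function name `alt_allele_frequencies`, the input dictionaries and the sample and population labels are all hypothetical.

```python
from collections import defaultdict


def alt_allele_frequencies(genotypes, sample_population):
    """Return the alternate allele frequency per population for one variant.

    genotypes: dict mapping sample ID to a VCF-style genotype string,
        e.g. "0/1" or "1|1"; "." marks a missing allele (assumed format).
    sample_population: dict mapping sample ID to a population label.
    Assumes a biallelic site where allele "1" is the alternate allele.
    """
    alt_counts = defaultdict(int)
    called_counts = defaultdict(int)

    for sample, genotype in genotypes.items():
        population = sample_population[sample]
        # Treat phased ("|") and unphased ("/") genotypes the same way.
        for allele in genotype.replace("|", "/").split("/"):
            if allele == ".":
                continue  # missing calls do not contribute to the total
            called_counts[population] += 1
            if allele == "1":
                alt_counts[population] += 1

    # Populations with no called alleles are omitted from the result.
    return {
        population: alt_counts[population] / called_counts[population]
        for population in called_counts
    }


# Hypothetical example data: four samples in two populations.
example_genotypes = {"S1": "0/1", "S2": "1|1", "S3": "0/0", "S4": "./."}
example_populations = {"S1": "EUR", "S2": "EUR", "S3": "AFR", "S4": "AFR"}
print(alt_allele_frequencies(example_genotypes, example_populations))
```

Missing calls are excluded from the denominator, so a population's frequency is based only on the alleles actually observed in it. A production pipeline would also need to handle multiallelic sites and sex chromosomes, which this sketch does not.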