Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, Washington, USA.
Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, University of Texas Health Science Center at Houston, Houston, Texas, USA.
Nat Genet. 2017 Oct 27;49(11):1560-1563. doi: 10.1038/ng.3968.
The exploding volume of whole-genome sequence (WGS) and multi-omics data requires new approaches for analysis. As one solution, we have created a cloud-based Analysis Commons, which brings together genotype and phenotype data from multiple studies in a setting that is accessible to multiple investigators. This framework addresses many of the challenges of multi-center WGS analyses, including data-sharing mechanisms, phenotype harmonization, integrated multi-omics analyses, annotation, and computational flexibility. In this setting, the computational pipeline facilitates a sequence-to-discovery analysis workflow, illustrated here by an analysis of plasma fibrinogen levels in 3996 individuals from the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) WGS program. The Analysis Commons represents a novel model for transforming WGS resources from a massive quantity of phenotypic and genomic data into knowledge of the determinants of health and disease risk in diverse human populations.