Institute for Biomedical Informatics, Departments of Genetics and Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA, 17821, USA.
Nat Commun. 2017 Oct 27;8(1):1167. doi: 10.1038/s41467-017-00802-2.
Genome-wide, imputed, sequence, and structural data are now available for exceedingly large sample sizes. The needs for data management, handling population structure and related samples, and performing associations have largely been met. However, the infrastructure to support analyses involving complexity beyond genome-wide association studies is not standardized or centralized. We provide the PLatform for the Analysis, Translation, and Organization of large-scale data (PLATO), a software tool equipped to handle multi-omic data for hundreds of thousands of samples to explore complexity using genetic interactions, environment-wide association studies and gene-environment interactions, phenome-wide association studies, as well as copy number and rare variant analyses. Using the data from the Marshfield Personalized Medicine Research Project, a site in the electronic Medical Records and Genomics Network, we apply each feature of PLATO to type 2 diabetes and demonstrate how PLATO can be used to uncover the complex etiology of common traits.