Odyssey: a semi-automated pipeline for phasing, imputation, and analysis of genome-wide genetic data.

Author information

Department of Biology, Indiana University-Purdue University Indianapolis, 723 W. Michigan Street, Indianapolis, IN, USA.

Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 5021 Health Information and Translational Sciences (HITS), 410 West 10th Street, Indianapolis, IN, USA.

Publication information

BMC Bioinformatics. 2019 Jun 28;20(1):364. doi: 10.1186/s12859-019-2964-5.

Abstract

BACKGROUND

Genome imputation, admixture resolution, and genome-wide association analyses are time-consuming and computationally intensive processes with many composite and requisite steps. Analysis time increases further when the programs required for these analyses must be built and installed. For scientists who are less versed in programming languages but want to perform these operations hands-on, learning to use the vast number of available programs involves a lengthy learning curve.

RESULTS

The Odyssey pipeline was developed to streamline the entire process into easy-to-use steps for scientists working with big data. Odyssey is a simplified, efficient, semi-automated genome-wide imputation and analysis pipeline that prepares raw genetic data and performs pre-imputation quality control, phasing, imputation, post-imputation quality control, population stratification analysis, and genome-wide association with statistical data analysis, including result visualization. It integrates programs such as PLINK, SHAPEIT, Eagle, IMPUTE, Minimac, and several R packages into a seamless, easy-to-use, and modular workflow controlled via a single user-friendly configuration file. Odyssey was built with compatibility in mind and therefore uses the Singularity container solution, which can be run on Linux, macOS, and Windows platforms. It also scales easily from a simple desktop to a High-Performance System (HPS).
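
The idea of a modular workflow driven by one configuration file can be illustrated with a minimal Python sketch. This is not Odyssey's code or its configuration format: the stage names simply follow the order given in the abstract, and the config keys (`phasing`, `imputation`, `skip`) and the function `plan_pipeline` are hypothetical names chosen for illustration. The sketch only builds an ordered plan of stages and does not invoke any of the underlying programs.

```python
# Hypothetical sketch: stage names follow the abstract; the config keys and
# function names are illustrative assumptions, not Odyssey's actual interface.
STAGES = [
    "data_preparation",
    "pre_imputation_qc",
    "phasing",
    "imputation",
    "post_imputation_qc",
    "population_stratification",
    "gwas_and_visualization",
]

# Stages for which the abstract names alternative programs.
TOOL_OPTIONS = {
    "phasing": {"SHAPEIT", "Eagle"},
    "imputation": {"IMPUTE", "Minimac"},
}


def plan_pipeline(config):
    """Return the ordered (stage, tool) pairs selected by a single config dict."""
    skipped = set(config.get("skip", []))
    plan = []
    for stage in STAGES:
        if stage in skipped:
            continue  # modularity: any stage can be left out
        tool = None
        if stage in TOOL_OPTIONS:
            tool = config.get(stage)
            if tool not in TOOL_OPTIONS[stage]:
                raise ValueError(f"unsupported tool for {stage}: {tool!r}")
        plan.append((stage, tool))
    return plan


# Example configuration (assumed keys): pick one phasing and one imputation tool.
example_config = {"phasing": "Eagle", "imputation": "Minimac", "skip": []}
for stage, tool in plan_pipeline(example_config):
    print(stage, tool if tool else "")
```

Keeping every choice in one dictionary mirrors the single-configuration-file design described above: changing a phasing or imputation program, or skipping a stage, means editing one value rather than the workflow itself.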

CONCLUSION

Odyssey facilitates efficient and fast automation of genome-wide association analysis and can go from raw genetic data to genome-phenome association visualization and analysis results in 3-8 h on average, depending on the input data, the choice of programs within the pipeline, and the available computer resources. Odyssey was built to be flexible, portable, compatible, scalable, and easy to set up. Biologists less familiar with programming can now work hands-on with their own big data using this easy-to-use pipeline.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a88/6599316/8d352651644f/12859_2019_2964_Fig1_HTML.jpg
