glactools: a command-line toolset for the management of genotype likelihoods and allele counts.

Author information

Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany.

Publication information

Bioinformatics. 2018 Apr 15;34(8):1398-1400. doi: 10.1093/bioinformatics/btx749.

Abstract

MOTIVATION

Research projects in population genomics routinely need to store genotyping information and population allele counts, combine files from different samples, query the data and export it to various formats. This is often done with bespoke in-house scripts, which cannot be easily adapted to new projects and seldom constitute reproducible workflows.

RESULTS

We introduce glactools, a set of command-line utilities that can import data from genotypes or population-wide allele counts into an intermediate representation, perform various operations on it and export the data to several file formats used by population genetics software. This intermediate format can take two forms: one stores per-individual genotype likelihoods, and the other stores allele counts from one or more individuals. glactools allows users to perform operations such as intersecting datasets, merging individuals into populations, creating subsets, performing queries (e.g. returning sites where a given population does not share an allele with a second one) and computing summary statistics to answer biologically relevant questions.
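
To make two of the operations named above more concrete, the following Python sketch merges individuals into a population by summing allele counts and then queries for sites where two groups share no allele. It is only an illustration under stated assumptions: the in-memory layout (a dictionary of per-sample reference/alternative counts) and the function names `merge_into_population`, `shares_allele` and `sites_without_shared_allele` are hypothetical and do not reflect the glactools file formats, commands or source code.

```python
# Hypothetical stand-in for allele-count records; NOT the glactools format.
# Each site maps a sample name to a (reference_count, alternative_count) pair.

def merge_into_population(site, members, pop_name):
    """Sum the allele counts of `members` into one population entry."""
    ref = sum(site[m][0] for m in members)
    alt = sum(site[m][1] for m in members)
    merged = {name: counts for name, counts in site.items() if name not in members}
    merged[pop_name] = (ref, alt)
    return merged

def shares_allele(a, b):
    """True if both count pairs contain at least one allele in common."""
    return (a[0] > 0 and b[0] > 0) or (a[1] > 0 and b[1] > 0)

def sites_without_shared_allele(sites, group_a, group_b):
    """Positions where both groups have data but no allele in common."""
    hits = []
    for pos, site in sites.items():
        a, b = site[group_a], site[group_b]
        if sum(a) > 0 and sum(b) > 0 and not shares_allele(a, b):
            hits.append(pos)
    return hits

# Made-up example data: two sites, three individuals.
sites = {
    ("chr1", 100): {"ind1": (2, 0), "ind2": (1, 1), "ind3": (0, 2)},
    ("chr1", 200): {"ind1": (2, 0), "ind2": (2, 0), "ind3": (0, 2)},
}

# Merge ind1 and ind2 into a hypothetical population "popA".
merged = {pos: merge_into_population(site, ["ind1", "ind2"], "popA")
          for pos, site in sites.items()}

# Sites where popA and ind3 have no allele in common.
result = sites_without_shared_allele(merged, "popA", "ind3")
print(result)
```

In this sketch, sites where either group has no observed alleles are skipped rather than reported, which is one possible design choice for handling missing data.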

AVAILABILITY AND IMPLEMENTATION

glactools is freely available for use under the GPL. It requires a C++ compiler and the htslib library. The source code and instructions on how to download test data are available on the website (https://grenaud.github.io/glactools/).

CONTACT

gabriel.reno@gmail.com.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
