CIIMAR/CIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Porto, Portugal.
Evol Bioinform Online. 2013 Nov 24;9:487-90. doi: 10.4137/EBO.S11335. eCollection 2013.
Rapid advances in genome sequencing technologies have increased the pace at which biological sequence databases become available to the scientific community. Obtaining and preparing an appropriate sequence dataset is therefore a crucial first step for all types of genomic analyses. Here, we present a script that simplifies and speeds up the downloading and preparation of biological sequence datasets for genomics studies. The script retrieves Ensembl-defined genomic features associated with a given Ensembl identifier. Coding (CDS) and genomic sequences can be retrieved for a relationship selected from a set of relationship types, considering either all available organisms or a user-specified subset of organisms. The script is user-friendly and starts in an interactive mode by default when no command-line options are specified.
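The retrieval described above (an Ensembl identifier, a relationship type, and an optional subset of organisms) can be illustrated with a minimal Python sketch. It is not the script presented here; it assumes the public Ensembl REST service at rest.ensembl.org and its `homology/id/:species/:id` and `sequence/id/:id` endpoints. The function names, the set of accepted relationship names, and the example identifiers are hypothetical choices made for illustration only.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Assumption: the public Ensembl REST server, not the tool described in the abstract.
ENSEMBL_REST = "https://rest.ensembl.org"

# Assumed relationship names, mirroring the REST "type" parameter for homology.
RELATIONSHIPS = {"orthologues", "paralogues", "projections", "all"}
# The two sequence kinds mentioned in the abstract.
SEQUENCE_TYPES = {"cds", "genomic"}


def homology_url(species, gene_id, relationship="orthologues", target_species=()):
    """Build the URL listing genes related to gene_id by the chosen relationship.

    An empty target_species means "all available organisms"; otherwise the
    query is restricted to the listed organisms.
    """
    if relationship not in RELATIONSHIPS:
        raise ValueError(f"unknown relationship type: {relationship}")
    params = [("type", relationship)]
    params += [("target_species", name) for name in target_species]
    return f"{ENSEMBL_REST}/homology/id/{quote(species)}/{quote(gene_id)}?{urlencode(params)}"


def sequence_url(stable_id, seq_type="cds"):
    """Build the URL for the CDS or genomic sequence of one Ensembl identifier."""
    if seq_type not in SEQUENCE_TYPES:
        raise ValueError(f"unknown sequence type: {seq_type}")
    return f"{ENSEMBL_REST}/sequence/id/{quote(stable_id)}?{urlencode({'type': seq_type})}"


def fetch(url, content_type="application/json"):
    """Download url and return the response body as text (needs network access)."""
    request = Request(url, headers={"Content-Type": content_type})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

For example, `fetch(sequence_url("ENST00000380152", "cds"), "text/x-fasta")` would request the CDS of a transcript in FASTA format, and `homology_url("homo_sapiens", "ENSG00000139618", target_species=["mus_musculus"])` restricts an orthologue query to a single organism. Both identifiers are examples only.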