BDBM 1.0: A Desktop Application for Efficient Retrieval and Processing of High-Quality Sequence Data and Application to the Identification of the Putative Coffea S-Locus.

Affiliations

ESEI-Escuela Superior de Ingeniería Informática, Universidade de Vigo, Edificio Politécnico, Campus Universitario As Lagoas s/n, 32004, Ourense, Spain.

CINBIO-Centro de Investigaciones Biomédicas, University of Vigo, Campus Universitario Lagoas-Marcosende, 36310, Vigo, Spain.

Publication information

Interdiscip Sci. 2019 Mar;11(1):57-67. doi: 10.1007/s12539-019-00320-3. Epub 2019 Feb 2.

Abstract

Bioinformatics is now one of the most important areas of modern biology, and creating high-quality scientific software to support it is a core activity for many researchers. In this context, high-quality sequence datasets are needed to make inferences about the evolution of species, genes, and gene families, or to obtain evidence for adaptive amino acid evolution, among other uses. Nevertheless, sequence data are very often spread over several databases, many useful genomes and transcriptomes are not annotated, and the available annotation may not cover the desired coding sequence isoform and/or is unlikely to be accurate. Moreover, although the text-based FASTA format is simple and usable by most software applications, it raises a number of issues that may be critical depending on the software used to analyse such files. As a result, researchers without training in informatics often use only a fraction of the available data. These issues can be addressed with existing software applications, but no single easy-to-use piece of software allows all these tasks to be performed within the same graphical interface. BDBM (Blast DataBase Manager), the application presented here, fills this gap. BDBM can be used to efficiently retrieve gene sequences from annotated and non-annotated genomes and transcriptomes. It can also be used to look for alternatives to existing annotations and to easily create reliable custom databases, which are essential for preparing high-quality datasets. Our analyses of the Coffea canephora genome using BDBM, aimed at identifying the S-locus region (which harbours the genes involved in gametophytic self-incompatibility), led to the conclusion that there are two likely regions: one on chromosome 2 (around region 6600000-6650000) and another on chromosome 5 (around 15830000-15930000). These findings are discussed in the context of the evolution of gametophytic self-incompatibility in the Rubiaceae.
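
The abstract mentions two routine tasks: coping with FASTA formatting issues and pulling a coordinate range out of a chromosome sequence. The sketch below is only an illustration of what such steps can look like in plain Python. It is not BDBM's implementation, and the function names (`read_fasta`, `clean_header`, `extract_region`), the toy records, and the 1-based inclusive coordinate convention are all assumptions made for this example.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle.

    Blank lines are skipped and multi-line sequences are joined.
    """
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def clean_header(header):
    """Keep the first whitespace-delimited token and replace characters
    other than letters, digits, '.', '_' and '-' with underscores."""
    tokens = header.split()
    token = tokens[0] if tokens else "unnamed"
    return "".join(c if c.isalnum() or c in "._-" else "_" for c in token)


def extract_region(sequence, start, end):
    """Return positions start..end of a sequence (assumed 1-based, inclusive)."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return sequence[start - 1:end]


# Toy input standing in for a genome file (hypothetical records, not real data).
demo = io.StringIO(
    ">chr2 toy record\nACGTAC\nGTTGCA\n\n>chr5 another toy record\nTTTTGGGGCCCC\n"
)
records = {clean_header(h): s.upper() for h, s in read_fasta(demo)}
region = extract_region(records["chr2"], 3, 8)
print(region)
```

With a real genome file, the same pattern would be applied to an opened file handle instead of the in-memory string, using whichever coordinate convention the source annotation follows.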
