Avian Immunome DB: an example of a user-friendly interface for extracting genetic information.

Affiliations

Department of Migration, Max Planck Institute of Animal Behavior, Am Obstberg, 78315, Radolfzell, Germany.

Department of Biology, University of Konstanz, Universitaetsstrasse 10, 78464, Konstanz, Germany.

Publication information

BMC Bioinformatics. 2020 Nov 12;21(1):502. doi: 10.1186/s12859-020-03764-3.

Abstract

BACKGROUND

Genomic and genetic studies often require a target list of genes before any hypothesis testing or experimental verification can begin. The ever-growing number of sequenced genomes, combined with a variety of annotation strategies, brings the potential for ambiguous gene symbols, making it cumbersome to capture the "correct" set of genes. In this article, we present and describe the Avian Immunome DB (AVIMM) for easy gene property extraction, as exemplified by avian immune genes. The avian immune system is characterised by a cascade of complex biological processes underpinned by more than 1000 different genes. It is a vital trait to study, particularly in birds, because they are a significant driver in spreading zoonotic diseases. With the completion of phase II of the B10K ("Bird 10,000 Genomes") consortium's whole-genome sequencing effort, we have included 363 annotated bird genomes in addition to other publicly available bird genome data, which together serve as a valuable foundation for AVIMM.

CONSTRUCTION AND CONTENT

A relational database holding avian immune gene evidence from Gene Ontology, Ensembl, UniProt and the B10K consortium has been designed and set up. The "seed" for the initial set of avian immune genes is the well-studied model organism chicken (Gallus gallus). Gene annotations, different transcript isoforms, nucleotide sequences and protein information, including amino acid sequences, are included. Ambiguous gene names (symbols) are resolved within the database and linked to their canonical gene symbol. AVIMM is supplemented by a command-line interface and a web front-end to query the database.
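The symbol resolution described above can be illustrated with a minimal sketch. The table names (`gene`, `gene_synonym`), the column names and the placeholder symbols (`GENEA`, `GA1`, and so on) are hypothetical and chosen only for illustration; they are not AVIMM's actual schema or data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical canonical/synonym layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE gene (
            gene_id          INTEGER PRIMARY KEY,
            canonical_symbol TEXT NOT NULL UNIQUE
        );
        CREATE TABLE gene_synonym (
            synonym TEXT NOT NULL,
            gene_id INTEGER NOT NULL REFERENCES gene(gene_id),
            PRIMARY KEY (synonym, gene_id)
        );
    """)
    with conn:  # commit the placeholder rows
        conn.executemany(
            "INSERT INTO gene VALUES (?, ?)",
            [(1, "GENEA"), (2, "GENEB")],  # placeholder symbols, not real genes
        )
        conn.executemany(
            "INSERT INTO gene_synonym VALUES (?, ?)",
            [("GA1", 1), ("GENEA-LIKE", 1), ("GB", 2)],
        )
    return conn


def resolve_symbol(conn, symbol):
    """Return the canonical symbol(s) matching a canonical name or a synonym."""
    query = """
        SELECT canonical_symbol FROM gene
        WHERE UPPER(canonical_symbol) = UPPER(?)
        UNION
        SELECT g.canonical_symbol
        FROM gene_synonym AS s
        JOIN gene AS g ON g.gene_id = s.gene_id
        WHERE UPPER(s.synonym) = UPPER(?)
        ORDER BY 1
    """
    return [row[0] for row in conn.execute(query, (symbol, symbol))]
```

In this sketch a caller can pass either a canonical symbol or any of its synonyms and receives the canonical symbol back, so downstream queries only ever need to join on one identifier.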

UTILITY AND DISCUSSION

The internal mapping of unique gene symbol identifiers to canonical gene symbols allows gene properties to be searched even when the query symbol is ambiguous. The database is organised into core and feature tables, which makes it straightforward to extend for future purposes. The database design is ready to be applied to other taxa or biological processes. Currently, the database contains 1170 distinct avian immune genes with canonical gene symbols and 612 synonyms across 363 bird species. While the command-line interface readily integrates into bioinformatics pipelines, the intuitive web front-end with download functionality offers sophisticated search functionalities and tracks the origin of each record. AVIMM is publicly accessible at https://avimm.ab.mpg.de.
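One way to picture the core/feature split and the per-record origin tracking is the sketch below. It assumes a single core `gene` table, feature tables that reference it by key, and a `source` column on every feature record; all of these names and the placeholder rows are hypothetical and do not reflect AVIMM's real schema.

```python
import sqlite3


def build_core_and_features():
    """Create a hypothetical core table plus one feature table with origins."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE gene (                 -- core table
            gene_id          INTEGER PRIMARY KEY,
            canonical_symbol TEXT NOT NULL UNIQUE
        );
        CREATE TABLE transcript (           -- feature table
            transcript_id INTEGER PRIMARY KEY,
            gene_id       INTEGER NOT NULL REFERENCES gene(gene_id),
            species       TEXT NOT NULL,
            source        TEXT NOT NULL     -- origin of the record
        );
    """)
    with conn:
        conn.execute("INSERT INTO gene VALUES (1, 'GENEA')")  # placeholder gene
        conn.executemany(
            "INSERT INTO transcript VALUES (?, ?, ?, ?)",
            [(1, 1, "Gallus gallus", "Ensembl"),
             (2, 1, "Gallus gallus", "B10K")],
        )
    return conn


def add_protein_feature(conn):
    """Extend the database with a new feature table; the core table is untouched."""
    conn.executescript("""
        CREATE TABLE protein (
            protein_id INTEGER PRIMARY KEY,
            gene_id    INTEGER NOT NULL REFERENCES gene(gene_id),
            species    TEXT NOT NULL,
            source     TEXT NOT NULL
        );
    """)
    with conn:
        conn.execute(
            "INSERT INTO protein VALUES (1, 1, 'Gallus gallus', 'UniProt')"
        )


def record_origins(conn, symbol):
    """List (feature, source) pairs recorded for one canonical gene symbol."""
    query = """
        SELECT 'transcript' AS feature, t.source
        FROM transcript AS t JOIN gene AS g ON g.gene_id = t.gene_id
        WHERE g.canonical_symbol = ?
        UNION ALL
        SELECT 'protein', p.source
        FROM protein AS p JOIN gene AS g ON g.gene_id = p.gene_id
        WHERE g.canonical_symbol = ?
        ORDER BY 1, 2
    """
    return list(conn.execute(query, (symbol, symbol)))
```

Under these assumptions, supporting a new kind of evidence means adding one more table keyed on `gene_id`, which is the sense in which such a layout is straightforward to extend.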


