Databases of homologous gene families for comparative genomics.

Authors

Penel Simon, Arigon Anne-Muriel, Dufayard Jean-François, Sertier Anne-Sophie, Daubin Vincent, Duret Laurent, Gouy Manolo, Perrière Guy

Affiliation

Laboratoire de Biométrie et Biologie Evolutive, CNRS, Université Claude Bernard - Lyon 1, 43 bd du 11 Novembre 1918, 69622 Villeurbanne Cedex, France.

Publication information

BMC Bioinformatics. 2009 Jun 16;10 Suppl 6(Suppl 6):S3. doi: 10.1186/1471-2105-10-S6-S3.

Abstract

BACKGROUND

Comparative genomics is a central step in many sequence analysis studies, from gene annotation and the identification of new functional regions in genomes, to the study of evolutionary processes at the molecular level (speciation, single-gene or whole-genome duplications, etc.) and phylogenetics. In that context, databases that provide users with high-quality homologous families and sequence alignments, as well as phylogenetic trees based on state-of-the-art algorithms, are becoming indispensable.

METHODS

We developed an automated procedure that performs massive all-against-all similarity searches, gene clustering, multiple alignment computation, and phylogenetic tree construction and reconciliation. Applying this procedure to a very large set of sequences is made possible by parallel computing on a large computer cluster.
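The gene clustering step can be illustrated with a minimal sketch. The abstract does not state which clustering criterion the procedure uses, so the code below assumes plain single-linkage grouping over pairwise similarity hits, implemented with a union-find structure. The function name `cluster_families`, the `(gene_a, gene_b, percent_identity)` tuple layout, and the 50% identity threshold are all hypothetical choices made for illustration, not details from the paper.

```python
from collections import defaultdict


def cluster_families(hits, min_identity=50.0):
    """Group genes into families by single-linkage over similarity hits.

    hits: iterable of (gene_a, gene_b, percent_identity) tuples, for example
    parsed from the output of an all-against-all similarity search.
    min_identity: illustrative threshold (an assumption, not from the paper).
    """
    parent = {}

    def find(x):
        # Register unseen genes as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b, identity in hits:
        root_a, root_b = find(a), find(b)  # both genes are registered here
        if identity >= min_identity and root_a != root_b:
            parent[root_b] = root_a  # merge the two families

    families = defaultdict(set)
    for gene in parent:
        families[find(gene)].add(gene)
    # Sort for a deterministic result.
    return sorted((sorted(f) for f in families.values()), key=lambda f: f[0])


# Hypothetical hits: g3-g4 is below the threshold, so two families remain.
example_hits = [
    ("g1", "g2", 82.0),
    ("g2", "g3", 64.5),
    ("g4", "g5", 91.0),
    ("g3", "g4", 31.0),
]
example_families = cluster_families(example_hits)
```

With these hypothetical hits, `g1`, `g2` and `g3` end up in one family and `g4` and `g5` in another. Single linkage is the simplest possible criterion; a production pipeline would typically also consider alignment coverage before merging.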

RESULTS

Three databases were developed using this procedure: HOVERGEN, HOGENOM and HOMOLENS. These databases share the same architecture but differ in their content. HOVERGEN contains sequences from vertebrates, HOGENOM is mainly devoted to completely sequenced microbial organisms, and HOMOLENS is devoted to metazoan genomes from Ensembl. Access to the databases is provided through Web query forms, a general retrieval system and a client-server graphical interface. The latter can be used to perform tree-pattern-based searches, which make it possible, among other uses, to retrieve sets of orthologous genes. The three databases, as well as the software required to build and query them, can be used or downloaded from the PBIL (Pôle Bioinformatique Lyonnais) site at http://pbil.univ-lyon1.fr/.
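To make the idea of retrieving orthologs from a reconciled tree concrete, here is a simplified sketch. It is not the tree-pattern search of the graphical interface, whose query language the abstract does not describe. It only assumes that reconciliation has labelled each internal node as a speciation or a duplication, and it reports gene pairs whose last common ancestor is a speciation node. The `Node` class, the event labels and the gene names are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Node:
    event: str = "leaf"  # "leaf", "speciation" or "duplication"
    gene: str = ""       # gene name, used for leaves only
    children: list = field(default_factory=list)


def leaves(node):
    """Return the gene names found below a node."""
    if node.event == "leaf":
        return [node.gene]
    return [g for child in node.children for g in leaves(child)]


def ortholog_pairs(node):
    """Return gene pairs whose last common ancestor is a speciation node."""
    if node.event == "leaf":
        return set()
    pairs = set()
    for child in node.children:
        pairs |= ortholog_pairs(child)
    if node.event == "speciation":
        groups = [leaves(child) for child in node.children]
        # Pair genes taken from different child subtrees of this node.
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                for a, b in product(groups[i], groups[j]):
                    pairs.add(tuple(sorted((a, b))))
    return pairs


# Hypothetical family: an ancient duplication followed by a speciation
# in each of the two resulting copies.
example_tree = Node("duplication", children=[
    Node("speciation", children=[Node(gene="HUMAN_A"), Node(gene="MOUSE_A")]),
    Node("speciation", children=[Node(gene="HUMAN_B"), Node(gene="MOUSE_B")]),
])
example_orthologs = ortholog_pairs(example_tree)
```

In this hypothetical tree, `HUMAN_A`/`MOUSE_A` and `HUMAN_B`/`MOUSE_B` are reported as orthologs, while pairs that span the duplication (such as `HUMAN_A`/`MOUSE_B`) are paralogs and are left out.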
