SALAD database: a motif-based database of protein annotations for plant comparative genomics.

Affiliation

Plant Genomics Research Unit and Bioinformatics Research Unit, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba 305-8602, Japan.

Publication information

Nucleic Acids Res. 2010 Jan;38(Database issue):D835-42. doi: 10.1093/nar/gkp831. Epub 2009 Oct 23.

Abstract

Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209,529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named 'SALAD on ARRAYs' to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis.
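The clustering step described in the abstract (pairwise scoring of motif patterns, followed by a dendrogram) can be illustrated with a minimal sketch. The abstract does not specify the scoring function or the linkage method, so the code below assumes Jaccard similarity over motif sets and average-linkage agglomerative clustering as stand-ins; it is not the SALAD implementation and omits bootstrapping. The protein names and motif identifiers are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two motif collections (assumed scoring, not SALAD's)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pairwise_scores(patterns):
    """Score every pair of proteins by the overlap of their motif sets."""
    return {
        frozenset((p, q)): jaccard(patterns[p], patterns[q])
        for p, q in combinations(sorted(patterns), 2)
    }


def average_similarity(scores, members_a, members_b):
    """Mean pairwise score between the members of two clusters."""
    total = sum(scores[frozenset((x, y))] for x in members_a for y in members_b)
    return total / (len(members_a) * len(members_b))


def cluster(patterns):
    """Average-linkage agglomerative clustering; returns the tree as nested tuples."""
    scores = pairwise_scores(patterns)
    # Each node is (subtree, list of member protein names).
    nodes = [(name, [name]) for name in sorted(patterns)]
    while len(nodes) > 1:
        # Pick the two clusters with the highest average similarity.
        i, j = max(
            combinations(range(len(nodes)), 2),
            key=lambda ij: average_similarity(scores, nodes[ij[0]][1], nodes[ij[1]][1]),
        )
        merged = ((nodes[i][0], nodes[j][0]), nodes[i][1] + nodes[j][1])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0][0]


# Hypothetical motif patterns for one protein group.
patterns = {
    "protA": ["m1", "m2", "m3"],
    "protB": ["m1", "m2"],
    "protC": ["m4", "m5"],
    "protD": ["m4"],
}
tree = cluster(patterns)
print(tree)
```

In this toy group, proteins sharing motifs m1 and m2 pair up first, and those sharing m4 pair up next, so the resulting tree separates the two motif families. A score that also accounts for motif order or position would be a natural refinement, but the abstract gives no detail on that point.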
