DWARF--a data warehouse system for analyzing protein families.

Authors

Fischer Markus, Thai Quan K, Grieb Melanie, Pleiss Jürgen

Affiliation

Institute of Technical Biochemistry, University of Stuttgart, Allmandring 31, D-70569, Germany.

Publication

BMC Bioinformatics. 2006 Nov 9;7:495. doi: 10.1186/1471-2105-7-495.

Abstract

BACKGROUND

The emerging field of integrative bioinformatics provides the tools to organize and systematically analyze vast amounts of highly diverse biological data, and thus makes it possible to gain a novel understanding of complex biological systems. The data warehouse DWARF applies integrative bioinformatics approaches to the analysis of large protein families.

DESCRIPTION

The data warehouse system DWARF integrates data on sequence, structure, and functional annotation for protein fold families. The underlying relational data model consists of three major sections representing entities related to the protein (biochemical function, source organism, classification into homologous families and superfamilies), the protein sequence (position-specific annotation, mutant information), and the protein structure (secondary structure information, superimposed tertiary structure). Tools for extracting, transforming, and loading data from publicly available resources (ExPDB, GenBank, DSSP) are provided to populate the database. The data can be accessed through an interface for searching and browsing, and through analysis tools that operate on annotation, sequence, or structure. We applied DWARF to the family of alpha/beta-hydrolases to host the Lipase Engineering Database. Release 2.3 contains 6138 sequences and 167 experimentally determined protein structures, which are assigned to 37 superfamilies and 103 homologous families.
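To make the three-section data model more concrete, the following is a minimal sketch of how protein, sequence, and structure entities could be linked in a relational schema. It is not the actual DWARF schema: all table names, column names, and rows are hypothetical and chosen only for illustration, and SQLite stands in for whatever database system DWARF uses.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption
# made for illustration, not taken from the DWARF implementation.
SCHEMA = """
CREATE TABLE superfamily (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE homologous_family (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    superfamily_id INTEGER NOT NULL REFERENCES superfamily(id)
);
-- Protein section: function, source organism, family classification
CREATE TABLE protein (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    biochemical_function TEXT,
    source_organism TEXT,
    family_id INTEGER NOT NULL REFERENCES homologous_family(id)
);
-- Sequence section: residues plus position-specific annotation
CREATE TABLE sequence (
    id INTEGER PRIMARY KEY,
    protein_id INTEGER NOT NULL REFERENCES protein(id),
    residues TEXT NOT NULL
);
CREATE TABLE position_annotation (
    sequence_id INTEGER NOT NULL REFERENCES sequence(id),
    position INTEGER NOT NULL,  -- 1-based residue index
    annotation TEXT NOT NULL
);
-- Structure section: one row per experimentally determined structure
CREATE TABLE structure (
    id INTEGER PRIMARY KEY,
    sequence_id INTEGER NOT NULL REFERENCES sequence(id),
    pdb_code TEXT NOT NULL,
    secondary_structure TEXT    -- per-residue string, e.g. derived from DSSP
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO superfamily VALUES (1, 'demo superfamily')")
    conn.executemany(
        "INSERT INTO homologous_family VALUES (?, ?, ?)",
        [(1, "demo family A", 1), (2, "demo family B", 1)],
    )
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?, ?, ?)",
        [
            (1, "protein 1", "toy hydrolase", "organism X", 1),
            (2, "protein 2", None, "organism Y", 2),
        ],
    )
    conn.executemany(
        "INSERT INTO sequence VALUES (?, ?, ?)",
        [(1, 1, "MGHSAG"), (2, 2, "MKTAYG")],
    )
    conn.execute(
        "INSERT INTO position_annotation VALUES (1, 4, 'toy active-site residue')"
    )
    conn.execute("INSERT INTO structure VALUES (1, 1, 'XXXX', 'CCEEHH')")
    return conn


def sequences_per_superfamily(conn):
    """Count sequences in each superfamily by joining across the sections."""
    rows = conn.execute(
        """
        SELECT sf.name, COUNT(s.id)
        FROM superfamily sf
        JOIN homologous_family hf ON hf.superfamily_id = sf.id
        JOIN protein p ON p.family_id = hf.id
        JOIN sequence s ON s.protein_id = p.id
        GROUP BY sf.id
        ORDER BY sf.name
        """
    ).fetchall()
    return dict(rows)


def annotated_residues(conn, sequence_id):
    """Return (position, residue, annotation) for each annotated position."""
    return conn.execute(
        """
        SELECT a.position, substr(s.residues, a.position, 1), a.annotation
        FROM position_annotation a
        JOIN sequence s ON s.id = a.sequence_id
        WHERE s.id = ?
        ORDER BY a.position
        """,
        (sequence_id,),
    ).fetchall()
```

Keeping annotation in its own table keyed by sequence and position is one plausible way to support position-specific queries; summary counts such as sequences per superfamily then reduce to joins across the three sections.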

CONCLUSION

DWARF has been designed for constructing databases of large structurally related protein families and for evaluating their sequence-structure-function relationships by a systematic analysis of sequence, structure and functional annotation. It has been applied to predict biochemical properties from sequence, and serves as a valuable tool for protein engineering.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cabb/1647292/59e7eba86a77/1471-2105-7-495-1.jpg
