Michalickova Katerina, Bader Gary D, Dumontier Michel, Lieu Hao, Betel Doron, Isserlin Ruth, Hogue Christopher W V
Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8.
BMC Bioinformatics. 2002 Oct 25;3:32. doi: 10.1186/1471-2105-3-32.
SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally hosted environment.
SeqHound is based on the National Center for Biotechnology Information (NCBI) data model and programming tools. It offers daily updated contents of all Entrez sequence databases, together with 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation (including Gene Ontology terms) and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API, or through an optimized local API. It provides the functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats; structures are available in ASN.1, XML and PDB formats. The API emphasizes complete genomes, taxonomy, domain and functional annotation, and 3-D structural functionality, while fielded text indexing remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries.
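Since sequences can be retrieved in FASTA format, a client will typically need to split such output into individual records. The following is a minimal Python sketch of that step only. It does not use or reproduce the SeqHound API; the function name `parse_fasta` and the example records (identifiers and residues) are hypothetical and chosen purely for illustration.

```python
from typing import Iterator, List, Optional, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text.

    The header is the definition line without its leading '>'.
    Sequence lines belonging to one record are concatenated.
    """
    header: Optional[str] = None
    chunks: List[str] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines between records
        if line.startswith(">"):
            # A new definition line closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical records: the identifiers and residues are made up.
EXAMPLE = """>example_1 hypothetical protein
MKTAYIAKQR
QISFVKSHFS
>example_2 hypothetical protein
MSDNE
"""

records = dict(parse_fasta(EXAMPLE))
for name, seq in records.items():
    print(name, len(seq))
```

Keying the records by their definition line keeps the sketch short; a real client would more likely extract a stable identifier from that line before using it as a key.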
The system has proven useful in several published bioinformatics projects, such as the BIND database, and offers a cost-effective infrastructure for research. SeqHound will continue to be developed and provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU General Public License at the SourceForge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit.