Kang Ho-Young, Kim Jihyun F, Kim Myung Hee, Park Seung-Hwan, Oh Tae-Kwang, Hur Cheol-Goo
Genome Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), P.O. Box 115, Yuseong, Daejeon 305-600, Republic of Korea.
FEBS Lett. 2006 May 15;580(11):2736-40. doi: 10.1016/j.febslet.2006.04.034. Epub 2006 Apr 21.
MELDB is a comprehensive protein database of microbial esterases and lipases, hydrolytic enzymes that are important in modern industry. Proteins in MELDB are clustered into groups by sequence similarity, using a local pairwise alignment algorithm and a graph clustering algorithm (TribeMCL). This differs from traditional approaches, which rely on global pairwise alignment and joining methods. Our procedure reduced the noise caused by dubious alignments in distantly related or unrelated regions of the sequences. The database holds 883 esterase and lipase sequences from microbial sources, and the conserved parts of each protein are identified. HMM profiles were generated for each cluster so that unknown sequences can be classified. The database contents can be searched by keyword, and query sequences can be aligned against both the sequence profiles and the sequences themselves.
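To illustrate the graph clustering step described above, the following is a minimal sketch of Markov clustering (the algorithm underlying TribeMCL) on a toy similarity matrix. It is not the MELDB pipeline: the function name `markov_cluster`, the `inflation` value, and the toy matrix are assumptions made for illustration, and in practice the edge weights would come from the local pairwise alignment scores.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / total if total else 0.0
    return out


def matmul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(similarity, inflation=2.0, max_iter=50, tol=1e-9):
    """Cluster nodes of a weighted similarity graph by Markov clustering.

    similarity: square, symmetric list of lists of non-negative weights
                (hypothetical stand-in for pairwise alignment scores).
    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(similarity)
    m = [[float(x) for x in row] for row in similarity]
    for i in range(n):
        if m[i][i] == 0.0:
            m[i][i] = 1.0  # self-loops keep the random walk from oscillating
    m = normalize_columns(m)

    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: two-step random walk
        # inflation: raise entries to a power, then renormalize columns,
        # which strengthens strong links and weakens weak ones
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break

    # Each non-empty row marks an attractor; its non-zero columns are members.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph: sequences 0-2 are mutually similar, as are 3-4; no cross links.
toy = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
print(markov_cluster(toy))  # [[0, 1, 2], [3, 4]]
```

Because the clustering operates on the similarity graph rather than on a joined tree of global alignments, weak or spurious links between unrelated sequences tend to be suppressed by the inflation step instead of pulling groups together.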