Holm L, Sander C
European Bioinformatics Institute, EMBL-EBI, Genome Campus, Cambridge CB10 1SD, UK.
Nucleic Acids Res. 1999 Jan 1;27(1):244-7. doi: 10.1093/nar/27.1.244.
Dali and HSSP are derived databases organizing protein space in the structurally known regions. We use an automatic structure alignment program (Dali) for the classification of all known 3D structures based on all-against-all comparison of 3D structures in the Protein Data Bank. The HSSP database associates 1D sequences with known 3D structures using a position-weighted dynamic programming method for sequence profile alignment (MaxHom). As a result, the HSSP database not only provides aligned sequence families, but also implies secondary and tertiary structures covering 36% of all sequences in Swiss-Prot. The structure classification by Dali and the sequence families in HSSP can be browsed jointly from a web interface providing a rich network of links between neighbours in fold space, between domains and proteins, and between structures and sequences. In particular, this results in a database of explicit multiple alignments of protein families in the twilight zone of sequence similarity. The organization of protein structures and families provides a map of the currently known regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The databases are available from http://www.embl-ebi.ac.uk/dali/
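As an illustration of what dynamic programming against position-specific scores can look like, the following is a minimal sketch, not the MaxHom implementation. It assumes a Smith-Waterman style local alignment with a single linear gap penalty. The profile representation, the function name `profile_local_align_score`, and the `gap` and `default` values are hypothetical choices made only for this example.

```python
from typing import Dict, List

# Hypothetical profile representation: one residue -> score map per position.
Profile = List[Dict[str, float]]


def profile_local_align_score(
    profile: Profile,
    seq: str,
    gap: float = -2.0,       # assumed linear gap penalty (illustrative value)
    default: float = -1.0,   # assumed score for residues absent from a column
) -> float:
    """Return the best local alignment score of `seq` against `profile`."""
    n, m = len(profile), len(seq)
    prev = [0.0] * (m + 1)   # DP row for the previous profile position
    best = 0.0
    for i in range(1, n + 1):
        cur = [0.0] * (m + 1)
        column = profile[i - 1]
        for j in range(1, m + 1):
            # Position-specific score: depends on the profile column,
            # not only on the residue pair as in a fixed substitution matrix.
            match = prev[j - 1] + column.get(seq[j - 1], default)
            cur[j] = max(
                0.0,              # start a new local alignment
                match,            # align residue j to profile position i
                prev[j] + gap,    # skip a profile position
                cur[j - 1] + gap  # skip a sequence residue
            )
            best = max(best, cur[j])
        prev = cur
    return best


# Toy three-position profile (made-up scores, for illustration only).
toy_profile: Profile = [{"A": 2.0}, {"C": 3.0}, {"D": 1.0}]
print(profile_local_align_score(toy_profile, "GACDG"))  # 6.0
```

The sketch keeps only two rows of the dynamic programming matrix, which is enough to obtain the score. Recovering the alignment itself would require storing the full matrix and tracing back.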