Huysmans M, Richelle J, Wodak S J
BIM, Everberg, Belgium.
Proteins. 1991;11(1):59-76. doi: 10.1002/prot.340110108.
A system is described that integrates data on protein structure, sequence, and survey results with molecular graphics and molecular mechanics software. Its major component is the relational database SESAM, presently implemented under the commercial package SYBASE. By design, the database allows full integration, within the same data organization, of raw data on protein structure, sequence, ligands, and heterogroups obtained from the Brookhaven Protein Databank with pure sequence information available from other databanks such as SWISS-PROT. In addition, it contains higher-level descriptions of structural and topological properties, as well as survey results, obtained by executing specialized computer programs. Besides closely combining structural and nonstructural information, SESAM has other important features that distinguish it from analogous systems developed elsewhere. It includes a molecular dictionary with a complete description of the geometric properties and energy parameters used in modeling and conformational energy calculations. Using this dictionary, structural data are validated by checking for localized inconsistencies in atomic coordinates, atomic symbols, and chirality definitions, and by flagging errors and incomplete entries. Because of both the dictionary and the validation procedures, SESAM can be readily interfaced with conventional molecular graphics and mechanics software packages, or with other specialized application programs. With the aid of appropriate interfaces, data access is fast enough for SESAM to be interrogated interactively. Prototypes of user interfaces, as well as an interface with the molecular graphics package BRUGEL, are described, and the power of the system is illustrated in applications such as homology-based protein modeling, computer-aided protein design, protein structure prediction, analysis of local structure motifs, and analysis of relationships between protein sequence and structure.
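The idea of validating structural data against a molecular dictionary held in the same relational store can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not give SESAM's schema, so the table names (`protein`, `residue`, `atom`, `dictionary_atom`), the column names, and the demo identifiers are hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for SYBASE. The query flags residues that lack an atom the dictionary expects, which is one simple form of an "incomplete entry" check.

```python
import sqlite3


def build_demo_db():
    """Build a tiny in-memory database with a hypothetical SESAM-like layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- One row per protein, linking a structure entry to a sequence entry.
        CREATE TABLE protein (
            protein_id   INTEGER PRIMARY KEY,
            pdb_code     TEXT,
            swissprot_id TEXT
        );
        CREATE TABLE residue (
            residue_id INTEGER PRIMARY KEY,
            protein_id INTEGER REFERENCES protein(protein_id),
            seq_pos    INTEGER,
            res_name   TEXT
        );
        CREATE TABLE atom (
            residue_id INTEGER REFERENCES residue(residue_id),
            atom_name  TEXT,
            x REAL, y REAL, z REAL
        );
        -- Hypothetical molecular dictionary: atoms expected per residue type.
        CREATE TABLE dictionary_atom (
            res_name  TEXT,
            atom_name TEXT
        );
        """
    )
    # Made-up identifiers, not real databank entries.
    conn.execute("INSERT INTO protein VALUES (1, 'DEMO', 'DEMO_SEQ')")
    conn.executemany(
        "INSERT INTO residue VALUES (?, ?, ?, ?)",
        [(1, 1, 1, "GLY"), (2, 1, 2, "GLY")],
    )
    conn.executemany(
        "INSERT INTO dictionary_atom VALUES (?, ?)",
        [("GLY", name) for name in ("N", "CA", "C", "O")],
    )
    # Residue 1 is complete; residue 2 lacks its carbonyl oxygen.
    # Coordinates are placeholders and play no role in this check.
    atoms = [(1, n, 0.0, 0.0, 0.0) for n in ("N", "CA", "C", "O")]
    atoms += [(2, n, 0.0, 0.0, 0.0) for n in ("N", "CA", "C")]
    conn.executemany("INSERT INTO atom VALUES (?, ?, ?, ?, ?)", atoms)
    return conn


def missing_atoms(conn):
    """Return (seq_pos, atom_name) pairs the dictionary expects but the data lack."""
    query = """
        SELECT r.seq_pos, d.atom_name
        FROM residue AS r
        JOIN dictionary_atom AS d ON d.res_name = r.res_name
        LEFT JOIN atom AS a
               ON a.residue_id = r.residue_id AND a.atom_name = d.atom_name
        WHERE a.atom_name IS NULL
        ORDER BY r.seq_pos, d.atom_name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(missing_atoms(build_demo_db()))
```

Keeping the dictionary in a table alongside the coordinates lets the completeness check be written as a single join. This is only one plausible way to arrange such a check and is not claimed to be how SESAM performs its validation.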