Hanisch Daniel, Zimmer Ralf, Lengauer Thomas
Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, D-53754 Sankt Augustin, Germany.
In Silico Biol. 2002;2(3):313-24.
We propose ProML, a specification language for protein sequences, structures, and families based on the open XML standard. The language provides a portable, system-independent, machine-parsable, and human-readable representation of the essential features of proteins. It is of immediate use for several bioinformatics applications: we discuss clustering proteins into families and representing the specific features shared within each cluster. Moreover, we use ProML to specify the data used in fold recognition benchmarks that exploit experimentally derived distance constraints.
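As a rough illustration of what a machine-parsable and human-readable XML protein record could look like, the following minimal sketch parses a small record with Python's standard library. The element and attribute names (`protein`, `sequence`, `constraints`, `distance`, `from`, `to`, `max`) are hypothetical and chosen only for this example; they are not taken from the published ProML schema, and the sequence and constraint values are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element and attribute names are illustrative
# assumptions, NOT the actual ProML schema. All values are invented.
RECORD = """\
<protein id="example1">
  <sequence>MKTAYIAKQR</sequence>
  <constraints>
    <distance from="2" to="9" max="12.0"/>
    <distance from="1" to="5" max="8.5"/>
  </constraints>
</protein>
"""


def parse_record(xml_text):
    """Return (protein id, sequence, [(i, j, max_distance), ...])."""
    root = ET.fromstring(xml_text)
    sequence = root.findtext("sequence", default="").strip()
    # Each <distance> element is read as an upper bound on the distance
    # between residues i and j (1-based positions, by assumption).
    constraints = [
        (int(d.get("from")), int(d.get("to")), float(d.get("max")))
        for d in root.iter("distance")
    ]
    return root.get("id"), sequence, constraints


protein_id, sequence, constraints = parse_record(RECORD)
```

Because the record is plain XML, any standards-compliant parser on any platform can read it, which is the portability property the abstract emphasizes.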