Higa Roberto Hiroshi, Cruz Sergio Aparecido Braga da, Kuser Paula Regina, Yamagishi Michel Eduardo Beleza, Fileto Renato, Oliveira Stanley Robson de Medeiros, Mazoni Ivan, Santos Edgard Henrique dos, Mancini Adauto Luiz, Neshich Goran
Centro Nacional de Pesquisa em Informática Agropecuária, Empresa Brasileira de Pesquisa Agropecuária, Campinas, SP, Brazil.
Genet Mol Res. 2006 Mar 31;5(1):127-37.
Homology-derived secondary structure of proteins (HSSP) is a well-known database of multiple sequence alignments (MSAs) that merges information on protein sequences and their three-dimensional structures. It is available for all proteins whose structures are deposited in the PDB. STING and (Java)Protein Dossier also use it to calculate and present relative entropy as a measure of the degree of conservation of each residue in proteins whose structures have been solved and deposited in the PDB. However, if STING and (Java)Protein Dossier are to support the analysis of protein structures that were modeled computationally, or that have been solved experimentally but not yet deposited in the PDB, a new method is needed for building alignments with the flavor of HSSP alignments (myMSAr). The present study describes such a method and its corresponding databank, SH2QS (database of sequences homologue to the query [structure-having] sequence). Our main interest in making myMSAr was to measure the degree of residue conservation for a given query sequence, regardless of whether it has a corresponding structure deposited in the PDB. In this study, we compare the measurements of residue conservation obtained from the corresponding alignments produced by HSSP and SH2QS. As case studies, we also present two biologically relevant examples: the first highlights the equivalence of residue conservation analyses based on HSSP and SH2QS alignments, and the second presents the degree of residue conservation for a computationally modeled structure, which consequently has no alignment reported by HSSP.
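The abstract names relative entropy as the per-residue conservation measure but does not give its formula. The following is a minimal sketch, not the authors' implementation: it assumes the measure is the Kullback-Leibler divergence of each alignment column's amino acid frequencies from a background distribution, and it assumes a uniform background over the 20 standard amino acids. The function names `column_relative_entropy` and `conservation_profile` are hypothetical, and gap characters are simply ignored.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_relative_entropy(column, background=None):
    """Relative entropy (in bits) of one alignment column against a background.

    Assumption: a uniform background over the 20 standard amino acids is used
    when none is supplied. Gaps and non-standard symbols are skipped.
    """
    if background is None:
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    residues = [c for c in column.upper() if c in background]
    if not residues:
        return 0.0  # a column made only of gaps carries no information
    total = len(residues)
    counts = Counter(residues)
    return sum(
        (n / total) * math.log2((n / total) / background[aa])
        for aa, n in counts.items()
    )


def conservation_profile(alignment):
    """Relative entropy for every column of an MSA given as equal-length strings."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [
        column_relative_entropy("".join(seq[i] for seq in alignment))
        for i in range(length)
    ]


if __name__ == "__main__":
    # Toy alignment: the first row plays the role of the query sequence.
    msa = ["ACD", "ACE", "A-F", "AGG"]
    for position, score in enumerate(conservation_profile(msa), start=1):
        print(position, round(score, 3))
```

Under these assumptions, a fully conserved column reaches the maximum of log2(20) bits and a column with all 20 residues equally represented scores zero, so higher values indicate stronger conservation. The same function could be applied to alignments from either source to compare their per-residue profiles.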