Liu Xin, Zheng Wei-Mou
Institute of Mechanics, Chinese Academy of Sciences, Beijing 100080, China.
J Bioinform Comput Biol. 2006 Jun;4(3):769-82. doi: 10.1142/s0219720006002156.
Amino acid substitution matrices play an essential role in protein sequence alignment, a fundamental task in bioinformatics. The most widely used matrices, such as the PAM matrices derived from homologous sequences and the BLOSUM matrices derived from aligned segments of PROSITE, do not incorporate conformational information in their construction. The few existing structure-based matrices are derived from limited structure-alignment data. Using the PDB_SELECT and DSSP databases, we create a database of sequence-conformation blocks that explicitly represent the sequence-structure relationship. Members of a block are identical in conformation and highly similar in sequence. From this block database, we derive a conformation-specific amino acid substitution matrix, CBSM60. The matrix shows improved performance in conformational segment search and homolog detection.
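The abstract does not state how the scores are computed from the blocks. As a rough illustration only, the sketch below assumes a BLOSUM-style log-odds construction: residue pairs are counted column by column within each block, and each score is the scaled log ratio of the observed pair frequency to the frequency expected by chance. The function name `log_odds_matrix`, the `scale` parameter, and the toy blocks are hypothetical and are not taken from the paper, which may weight or cluster segments differently.

```python
import math
from collections import Counter
from itertools import combinations


def log_odds_matrix(blocks, scale=2.0):
    """Return {(a, b): score} for unordered residue pairs with a <= b.

    `blocks` is a list of blocks; each block is a list of equal-length,
    ungapped segments (strings). Assumed BLOSUM-style scoring, not the
    paper's exact procedure.
    """
    pair_counts = Counter()
    for block in blocks:
        for column in zip(*block):  # one aligned position at a time
            for a, b in combinations(column, 2):
                pair_counts[tuple(sorted((a, b)))] += 1

    total = sum(pair_counts.values())
    if total == 0:
        raise ValueError("need at least one block with two or more segments")

    # Observed frequency of each unordered pair.
    q = {pair: n / total for pair, n in pair_counts.items()}

    # Marginal residue frequencies implied by the pair frequencies.
    p = Counter()
    for (a, b), freq in q.items():
        if a == b:
            p[a] += freq
        else:
            p[a] += freq / 2
            p[b] += freq / 2

    scores = {}
    for (a, b), freq in q.items():
        # Chance expectation: p_a^2 for identical pairs, 2*p_a*p_b otherwise.
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(scale * math.log2(freq / expected))
    return scores


# Toy input: one block of three segments sharing the same conformation.
toy_blocks = [["AC", "AC", "AD"]]
toy_scores = log_odds_matrix(toy_blocks)
```

Pairs that never co-occur in any block get no entry in the returned dictionary; a real matrix would need a rule for such pairs (for example pseudocounts), which this sketch leaves out.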