Predicting reliable regions in protein sequence alignments.

Author information

Cline Melissa, Hughey Richard, Karplus Kevin

Affiliation

Center for Biomolecular Science and Engineering, Jack Baskin School of Engineering, University of California, Santa Cruz, CA 95064, USA.

Publication information

Bioinformatics. 2002 Feb;18(2):306-14. doi: 10.1093/bioinformatics/18.2.306.

DOI:10.1093/bioinformatics/18.2.306

PMID:11847078

Abstract

MOTIVATION

Protein sequence alignments have a myriad of applications in bioinformatics, including secondary and tertiary structure prediction, homology modeling, and phylogeny. Unfortunately, all alignment methods make mistakes, and mistakes in alignments often yield mistakes in their application. Thus, a method to identify and remove suspect alignment positions could benefit many areas in protein sequence analysis.

RESULTS

We tested four predictors of alignment position reliability, including near-optimal alignment information, column score, and secondary structural information. We validated each predictor against a large library of alignments, removing positions predicted as unreliable. Near-optimal alignment information was the best predictor, removing 70% of the substantially misaligned positions and 58% of the over-aligned positions, while retaining 86% of those aligned accurately.
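
The removal and retention percentages above can be read as per-category rates of a threshold filter over alignment positions. The following is a minimal sketch of that bookkeeping, not the authors' implementation: the function name `removal_rates`, the score scale, the 0.5 threshold, and the toy data are all assumptions made for illustration, and the scores stand in for any reliability predictor.

```python
from collections import Counter


def removal_rates(scores, labels, threshold):
    """Return, for each label, the fraction of positions the filter removes.

    Hypothetical helper: a position is removed when its predicted
    reliability score falls below `threshold`.
    """
    total = Counter(labels)
    removed = Counter(
        label for score, label in zip(scores, labels) if score < threshold
    )
    return {label: removed[label] / total[label] for label in total}


# Toy data invented for illustration; these are not values from the paper.
scores = [0.9, 0.8, 0.2, 0.45, 0.1, 0.6, 0.95, 0.3]
labels = [
    "accurate", "accurate", "misaligned", "accurate",
    "over-aligned", "misaligned", "accurate", "over-aligned",
]

rates = removal_rates(scores, labels, threshold=0.5)
retained_accurate = 1.0 - rates["accurate"]
```

Raising the threshold removes more misaligned and over-aligned positions at the cost of discarding more accurately aligned ones, which is the trade-off the reported figures summarize.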

Abstract (translation)