Scordis P, Flower D R, Attwood T K
School of Biological Sciences, University of Manchester, UK.
Bioinformatics. 1999 Oct;15(10):799-806. doi: 10.1093/bioinformatics/15.10.799.
By identifying an unknown gene or protein as a member of a known family, we can infer a wealth of previously compiled information pertinent to that family and its members.
This paper introduces a method that classifies sequences using familial definitions from the PRINTS database, allowing progress to be made with the identification of distant evolutionary relationships. The approach makes use of the contextual information inherent in a multiple-motif method, and has the power to identify hitherto unidentified relationships in mass genome data. We exemplify our method by a comparison of database searches with uncharacterized sequences from the Caenorhabditis elegans and Saccharomyces cerevisiae genome projects. This analysis tool combines a simple, user-friendly interface with the capacity to provide an 'intelligent', biologically relevant result.
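As a rough illustration of why a multiple-motif approach carries contextual information, the sketch below scores a sequence against a toy "fingerprint", meaning an ordered group of conserved motifs, and counts how many motifs match in the expected order. It is not the authors' algorithm: the identity-based score, the 0.6 threshold, the greedy order check, and the names `best_motif_hit` and `match_fingerprint` are all assumptions made for illustration, and the motifs are invented rather than taken from PRINTS.

```python
from typing import List, Optional, Tuple

# Hypothetical toy fingerprint: an ordered list of motifs, each given as a few
# aligned example segments of equal length. These are NOT real PRINTS entries.
Fingerprint = List[List[str]]


def best_motif_hit(sequence: str, motif: List[str]) -> Tuple[int, float]:
    """Return (position, score) of the best ungapped window for one motif.

    The score is the highest fraction of identical residues between a
    window of the sequence and any example segment of the motif.
    """
    width = len(motif[0])
    best_pos, best_score = -1, 0.0
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        for segment in motif:
            identity = sum(a == b for a, b in zip(window, segment)) / width
            if identity > best_score:
                best_pos, best_score = start, identity
    return best_pos, best_score


def match_fingerprint(
    sequence: str, fingerprint: Fingerprint, threshold: float = 0.6
) -> Tuple[int, List[Optional[int]]]:
    """Return (number of motifs matched in order, per-motif hit positions).

    A motif with no window scoring at least `threshold` is recorded as None.
    The order check is greedy: a motif counts only if it starts after the
    end of the previously counted motif.
    """
    hits: List[Optional[int]] = []
    for motif in fingerprint:
        pos, score = best_motif_hit(sequence, motif)
        hits.append(pos if score >= threshold else None)

    ordered, last_end = 0, -1
    for pos, motif in zip(hits, fingerprint):
        if pos is not None and pos > last_end:
            ordered += 1
            last_end = pos + len(motif[0]) - 1
    return ordered, hits


# Invented three-motif fingerprint and two invented query sequences.
toy_fingerprint: Fingerprint = [["ACDEF", "ACDEY"], ["GHIKL"], ["MNPQR"]]
full_match = match_fingerprint("TTACDEFTTTGHIKLTTTMNPQRTT", toy_fingerprint)
partial_match = match_fingerprint("TTMNPQRTTTACDEFTT", toy_fingerprint)
```

In this toy setting, a sequence that contains all three motifs in order is reported as a full match, while one that contains only some motifs, or contains them out of order, is reported as a partial match. That difference is the kind of context a single-motif search cannot provide.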