Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany.
Nucleic Acids Res. 2010 Jan;38(Database issue):D223-6. doi: 10.1093/nar/gkp949. Epub 2009 Nov 11.
Large-scale sequence comparison remains the most powerful tool in sequence analysis for predicting protein function and reconstructing evolutionary origins. Because the number of known protein sequences grows exponentially and the similarity matrix grows quadratically with that number, computing the Similarity Matrix of Proteins (SIMAP) has become a computationally intensive task. The SIMAP database provides a comprehensive and up-to-date pre-calculation of the protein sequence similarity matrix, sequence-based features and sequence clusters. As of September 2009, SIMAP covers 48 million proteins and more than 23 million non-redundant sequences. Novel features of SIMAP include an expanded sequence space that now incorporates databases such as ENSEMBL, and the integration of metagenomes based on their consistent processing and annotation. Furthermore, protein function predictions by Blast2GO are pre-calculated for all sequences in SIMAP, and the data access and query functions have been improved. SIMAP helps biologists query the up-to-date sequence space systematically and facilitates large-scale downstream projects in computational biology. SIMAP is freely accessible to individual users through the web portal (http://mips.gsf.de/simap/) and programmatically through DAS (http://webclu.bio.wzw.tum.de/das/) and a Web Service (http://mips.gsf.de/webservices/services/SimapService2.0?wsdl).
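The quadratic growth mentioned in the abstract can be made concrete with a small calculation. The sketch below assumes a symmetric all-against-all comparison in which each unordered pair of sequences is counted once; the abstract does not spell out how SIMAP counts comparisons, and the function name is purely illustrative.

```python
def pairwise_comparisons(n: int) -> int:
    """Unordered sequence pairs in an all-against-all comparison (assumed symmetric)."""
    return n * (n - 1) // 2


# Lower bound on non-redundant sequences quoted for September 2009.
n_2009 = 23_000_000
pairs_2009 = pairwise_comparisons(n_2009)

# Doubling the number of sequences roughly quadruples the number of pairs.
growth = pairwise_comparisons(2 * n_2009) / pairs_2009

print(f"{pairs_2009:.3e} pairs for {n_2009:,} sequences")
print(f"doubling the sequence count multiplies the pairs by {growth:.2f}")
```

Under this assumption the 2009 sequence set already implies on the order of 10^14 pairs, which illustrates why a shared pre-calculated matrix is attractive compared with recomputing similarities per project.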