Sahraeian Sayed M, Luo Kevin R, Brenner Steven E
Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA.
Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA.
Nucleic Acids Res. 2015 Jul 1;43(W1):W141-7. doi: 10.1093/nar/gkv461. Epub 2015 May 15.
We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. The SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.