Casimiro-Soriguer Carlos S, Muñoz-Mérida Antonio, Pérez-Pulido Antonio J
Centro Andaluz de Biología del Desarrollo (CABD-CSIC-JA), Universidad Pablo de Olavide, Sevilla, Spain.
CIBIO-InBIO, Research Network in Biodiversity and Evolutionary Biology, Universidade do Porto, Vairão, Portugal.
Proteomics. 2017 Jun;17(12). doi: 10.1002/pmic.201700071.
The falling cost of next-generation sequencing has led to enormous growth in the number of sequenced genomes and transcriptomes, allowing wet labs to obtain sequences from their organisms of study. To make the most of these data, one of the first steps should be the functional annotation of the protein-coding genes. However, this has traditionally been a slow and tedious step that can involve characterizing thousands of sequences. Sma3s is an accurate computational tool for annotating proteins in an unattended way. We have now developed a completely new version, which includes functionalities useful for both fundamental and applied science. The results now provide functional categories such as biological processes, which are useful both for characterizing particular sequence datasets and for comparing results from different projects. One of the most important innovations is that the tool now has low computational requirements: the complete annotation of a simple proteome or transcriptome usually takes around 24 hours on a personal computer. Sma3s has been tested with a large number of complete proteomes and transcriptomes, and it has demonstrated its potential in health science and other specific projects.