Yamasaki Chisato, Kawashima Hiroaki, Todokoro Fusano, Imamizu Yasuhiro, Ogawa Makoto, Tanino Motohiko, Itoh Takeshi, Gojobori Takashi, Imanishi Tadashi
Integrated Database Group, Japan Biological Information Research Center, Japan Biological Informatics Consortium, AIST Waterfront Bio-IT Research Building, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan.
Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W345-9. doi: 10.1093/nar/gkl283.
The Transcriptome Auto-annotation Conducting Tool (TACT) is a web-based tool that automates functional annotation of transcripts by integrating sequence similarity searches with functional motif predictions. We built TACT by adding two similarity searches, FASTY and BLASTX, run against the protein sequence databases UniProtKB (Swiss-Prot/TrEMBL) and RefSeq, together with a unified motif prediction program, InterProScan, to the ORF-prediction pipeline originally designed for the 'H-Invitational' human transcriptome annotation project. The system applies these constituent programs successively to an mRNA sequence to predict the most plausible ORF and the function of the encoded protein. In this study, we applied TACT to 19 574 non-redundant human transcripts registered in H-InvDB and evaluated its predictive power by the degree of agreement with the human-curated functional annotation in H-InvDB. TACT assigned a functional description to 12 559 transcripts (64.2%); the remainder were annotated as hypothetical proteins. The overall agreement of functional annotation with H-InvDB, including transcripts annotated as hypothetical proteins, was 83.9% (16 432/19 574). These results show that TACT is useful for functional annotation and that its predictions of ORFs and protein functions are highly accurate, approaching the results of human curation. TACT is freely available at http://www.jbirc.aist.go.jp/tact/.
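The percentages reported in the abstract can be reproduced from the stated counts. The following is a minimal sketch of that arithmetic only, not part of TACT itself; the class name `AnnotationSummary` and its field names are hypothetical labels introduced here for illustration, while the three counts are taken directly from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationSummary:
    """Counts reported in the abstract (the names here are illustrative)."""
    total: int          # non-redundant human transcripts from H-InvDB
    with_function: int  # transcripts given a functional description
    agreeing: int       # transcripts whose annotation agrees with H-InvDB

    def percent(self, count: int) -> float:
        """Return count as a percentage of total, rounded to one decimal."""
        return round(100 * count / self.total, 1)

    @property
    def hypothetical(self) -> int:
        """Transcripts left as hypothetical proteins (the remainder)."""
        return self.total - self.with_function


summary = AnnotationSummary(total=19574, with_function=12559, agreeing=16432)

print(summary.percent(summary.with_function))  # share with a functional description
print(summary.hypothetical)                    # number of hypothetical proteins
print(summary.percent(summary.agreeing))       # overall agreement with H-InvDB
```

Note that the agreement figure counts transcripts annotated as hypothetical proteins in both sources, which is why it can exceed the share of transcripts that received a functional description.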