Wu Sijia, Han Jiuqiang, Zhang Xinman, Zhong Dexing, Liu Ruiling
School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China.
J Theor Biol. 2017 Jun 21;423:63-70. doi: 10.1016/j.jtbi.2017.04.020. Epub 2017 Apr 26.
The integrase catalytic domain (ICD) is an essential component of the retroviral integration reaction, which incorporates newly synthesized viral DNA into the DNA of infected cells. Because ICD is crucial for retroviral replication and host cells have no equivalent of integrase, ICD is a promising drug target for therapeutic intervention. However, the ICDs annotated in the UniProtKB database are still too few to give a good understanding of their statistical characteristics. This work therefore proposes a computational ICD model to annotate these domains in retroviruses. Scanning sequences without ICD annotations, the proposed model discovered 11,660 new putative ICDs. To provide confidence in its ICD predictions, the model was tested under different cross-validation methods, compared with other database search tools, and verified on independent datasets. Furthermore, an evolutionary analysis of the annotated ICDs of retroviruses revealed a tight connection between ICD and retroviral classification. All datasets used in this paper and the software tool implementing the model are available for free download at https://sourceforge.net/projects/icdtool/files/?source=navbar.