Bio-Sciences Division, Innovation Labs, Tata Consultancy Services, 1 Software Units Layout, Hyderabad 500 081, India.
J Biosci. 2010 Sep;35(3):351-64. doi: 10.1007/s12038-010-0040-4.
Genomic islands (GIs) are regions of a genome that are believed to have been acquired through horizontal gene transfer and are therefore likely to be compositionally distinct from the rest of the genome. The majority of the genes located in a GI encode a particular function. Depending on the genes they encode, GIs can be classified into categories such as 'metabolic islands', 'symbiotic islands', 'resistance islands' and 'pathogenicity islands'. Computational detection of GIs is well established, and many algorithms are available for it. We present a new method, termed Improved N-mer based Detection of Genomic Islands Using Sequence-clustering (INDeGenIUS), for the identification of GIs. The method was applied to 400 completely sequenced species belonging to proteobacteria. Based on the genes encoded in the identified GIs, the GIs were grouped into 6 categories: metabolic islands, symbiotic islands, resistance islands, secretion islands, pathogenicity islands and motility islands. INDeGenIUS identified several new islands of interest that earlier algorithms had missed. The algorithm has potential application in identifying functionally relevant GIs in the large number of genomes now being sequenced. Investigation of the predicted GIs in pathogens may lead to the identification of potential drug and vaccine candidates.
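The idea that a GI is "compositionally distinct from the rest of the genome" can be illustrated with a minimal sketch. This is not the INDeGenIUS algorithm, whose details (including its sequence-clustering step) are not given in the abstract; it is a generic windowed N-mer comparison, and the function names, window size, N-mer length and z-score cutoff are all assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_freqs(seq, k):
    """Return the normalized k-mer (N-mer) frequencies of seq as a dict."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def distance(p, q):
    """Euclidean distance between two sparse frequency vectors."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in keys))


def atypical_windows(genome, k=2, window=200, step=200, z_cutoff=2.0):
    """Flag windows whose k-mer composition deviates from the whole genome.

    Hypothetical helper: each window is compared with the genome-wide
    k-mer profile, and windows whose distance lies more than z_cutoff
    standard deviations above the mean distance are reported.
    """
    background = kmer_freqs(genome, k)
    starts = list(range(0, len(genome) - window + 1, step))
    if not starts:
        return []
    dists = [distance(kmer_freqs(genome[s:s + window], k), background)
             for s in starts]
    mean = sum(dists) / len(dists)
    sd = sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    if sd == 0:
        return []  # every window looks the same; nothing stands out
    return [(s, s + window) for s, d in zip(starts, dists)
            if (d - mean) / sd > z_cutoff]


if __name__ == "__main__":
    # Toy genome (made up): a uniform "host" with one GC-rich insert.
    host = "ACGT" * 250
    island = "GGGGCCCC" * 25
    print(atypical_windows(host + island + host))
```

A real genome would call for longer N-mers, overlapping windows and a more robust outlier criterion; the toy values here only keep the example small.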