Pereira Mariana Buongermino, Wallroth Mikael, Kristiansson Erik, Axelson-Fisk Marina
1 Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden.
2 Centre for Antibiotic Resistance Research (CARe) at University of Gothenburg, Gothenburg, Sweden.
J Comput Biol. 2016 Nov;23(11):891-902. doi: 10.1089/cmb.2016.0024. Epub 2016 Jul 18.
Integrons are genetic elements that facilitate horizontal gene transfer in bacteria and are known to harbor genes associated with antibiotic resistance. Gene mobility in integrons is governed by the presence of attC sites, which are 55- to 141-nucleotide-long imperfect inverted repeats. Here we present HattCI, a new method for fast and accurate identification of attC sites in large DNA data sets. The method is based on a generalized hidden Markov model that describes each core component of an attC site individually. In twofold cross-validation experiments on a manually curated reference data set of 231 attC sites from class 1 and 2 integrons, HattCI showed high sensitivities of up to 91.9% while maintaining satisfactory false-positive rates. When applied to a metagenomic data set of 35 microbial communities from different environments, HattCI found a substantially higher number of attC sites in the samples that are known to contain more horizontally transferred elements. HattCI will significantly increase the ability to identify attC sites and thus integron-mediated genes in genomic and metagenomic data. HattCI is implemented in C and is freely available at http://bioinformatics.math.chalmers.se/HattCI.
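To make the notion of an "imperfect inverted repeat" within the stated 55 to 141 nucleotide range concrete, the following Python sketch scores how well the two ends of a candidate window pair with each other as reverse complements. It is only an illustration and is not HattCI's generalized hidden Markov model: the function names, the arm length of 7, and the identity threshold of 0.85 are all hypothetical choices made for this example.

```python
# Illustrative sketch only: a naive inverted-repeat scan, NOT the HattCI model.
# The arm length and identity threshold are hypothetical example values.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement; non-ACGT symbols are left unchanged."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_repeat_identity(seq: str, arm: int) -> float:
    """Fraction of left-arm bases that match the reverse complement
    of the right arm (1.0 means a perfect inverted repeat)."""
    if arm <= 0 or 2 * arm > len(seq):
        raise ValueError("arm must be positive and at most half the sequence")
    left = seq[:arm].upper()
    right_rc = reverse_complement(seq[-arm:])
    matches = sum(a == b for a, b in zip(left, right_rc))
    return matches / arm


def candidate_windows(dna: str, arm: int = 7, min_len: int = 55,
                      max_len: int = 141, min_identity: float = 0.85):
    """Yield (start, end, identity) for every window whose length lies in
    [min_len, max_len] and whose arms pair at or above min_identity."""
    n = len(dna)
    for start in range(n - min_len + 1):
        longest = min(max_len, n - start)
        for length in range(min_len, longest + 1):
            window = dna[start:start + length]
            identity = inverted_repeat_identity(window, arm)
            if identity >= min_identity:
                yield start, start + length, identity


if __name__ == "__main__":
    left_arm = "GTTAGAC"
    toy = left_arm + "A" * 46 + reverse_complement(left_arm)  # 60 nt toy site
    for hit in candidate_windows(toy):
        print(hit)
```

A brute-force scan like this checks every window of every allowed length and ignores the internal structure of the site, which is the kind of detail a model that describes each core component individually is meant to capture.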