The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY.
Gigascience. 2018 Apr 1;7(4):1-11. doi: 10.1093/gigascience/giy015.
The concept of the "pan-genome," which refers to the total complement of genes within a given sample or species, is well established in bacterial genomics. Rapid and scalable pipelines are available for managing and interpreting pan-genomes from large batches of annotated assemblies. However, despite overwhelming evidence that variation in intergenic regions in bacteria can directly influence phenotypes, most current approaches for analyzing pan-genomes focus exclusively on protein-coding sequences.
To address this, we present Piggy, a novel pipeline that emulates Roary but operates only on intergenic regions. A key utility provided by Piggy is the detection of highly divergent ("switched") intergenic regions (IGRs) upstream of genes. We demonstrate the use of Piggy on large datasets of clinically important lineages of Staphylococcus aureus and Escherichia coli.
For S. aureus, we show that highly divergent (switched) IGRs are associated with differences in gene expression and we establish a multilocus reference database of IGR alleles (igMLST; implemented in BIGSdb).