panRGP: a pangenome-based method to predict genomic islands and explore their diversity.

Affiliations

LABGeM, Génomique Métabolique, CEA, Genoscope, Institut François Jacob, Université d'Évry, Université Paris-Saclay, CNRS, Evry, France.

Publication information

Bioinformatics. 2020 Dec 30;36(Suppl_2):i651-i658. doi: 10.1093/bioinformatics/btaa792.

Abstract

MOTIVATION

Horizontal gene transfer (HGT) is a major source of variability in prokaryotic genomes. Regions of genome plasticity (RGPs) are clusters of genes located in highly variable genomic regions. Most of them arise from HGT and correspond to genomic islands (GIs). The study of those regions at the species level has become increasingly difficult with the data deluge of genomes. To date, no methods are available to identify GIs using hundreds of genomes to explore their diversity.

RESULTS

We present here the panRGP method, which predicts RGPs using pangenome graphs built from all available genomes for a given species. It allows the study of thousands of genomes in order to access the diversity of RGPs and to predict spots of insertion. It gave the best predictions when benchmarked alongside other GI detection tools against a reference dataset. In addition, we illustrated its use on metagenome-assembled genomes by redefining the borders of the leuX tRNA hotspot, a well-studied spot of insertion in Escherichia coli. panRGP is a scalable and reliable tool to predict GIs and spots, making it an ideal approach for large comparative studies.
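To make the notion of an RGP as a cluster of genes in a variable region more concrete, the following toy sketch scans one genome, represented as an ordered list of gene labels, and reports maximal runs of non-conserved genes. It is not the panRGP algorithm, which is only summarized in this abstract. The labels "persistent", "shell" and "cloud" are borrowed from PPanGGOLiN's partition vocabulary as an assumption, and the function name `find_variable_runs` and the `min_genes` threshold are hypothetical.

```python
from typing import List, Tuple


def find_variable_runs(labels: List[str], min_genes: int = 3) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of maximal runs of
    non-persistent genes that contain at least `min_genes` genes.

    Illustrative only: `labels` holds one hypothetical partition label
    per gene, in genomic order.
    """
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current variable run began, if any
    for i, label in enumerate(labels):
        if label != "persistent":
            if start is None:
                start = i
        else:
            # A conserved gene closes the current run, if one is open.
            if start is not None and i - start >= min_genes:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the gene list.
    if start is not None and len(labels) - start >= min_genes:
        runs.append((start, len(labels) - 1))
    return runs


# Hypothetical gene order for a single genome.
genome = [
    "persistent", "persistent", "shell", "cloud", "cloud",
    "persistent", "cloud", "persistent", "shell", "shell", "shell",
]
print(find_variable_runs(genome))  # [(2, 4), (8, 10)]
```

The single variable gene at index 6 is ignored because it falls below the assumed length threshold. A real predictor would also have to handle circular contigs and isolated conserved genes inside a variable region, which this sketch does not attempt.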

AVAILABILITY AND IMPLEMENTATION

The methods presented in the current work are available through the following software: https://github.com/labgem/PPanGGOLiN. Detailed results and scripts to compute the benchmark metrics are available at https://github.com/axbazin/panrgp_supdata.
