PGAP-X: extension on pan-genome analysis pipeline.
Author information
CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, People's Republic of China.
University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China.
Publication information
BMC Genomics. 2018 Jan 19;19(Suppl 1):36. doi: 10.1186/s12864-017-4337-7.
BACKGROUND
Since PGAP (pan-genome analysis pipeline) was published in 2012, it has been widely used in bacterial genomics research. Although PGAP integrates several modules for pan-genome analysis, interpreting and visualizing the resulting data properly and effectively remains a challenge.
RESULT
To better present bacterial genomic characteristics, we developed a novel cross-platform software package named PGAP-X. Four data analysis modules were developed and integrated: whole-genome sequence alignment, orthologous gene clustering, pan-genome profile analysis, and genetic variant analysis. The results of these analyses can be visualized directly in PGAP-X. Its visualization modules cover comparison of genome structure, gene distribution by conservation, the pan-genome profile curve, and variation in genic and genomic regions. In addition, result data produced by other programs with similar functions can be imported for further analysis and visualization in PGAP-X. To test the performance of PGAP-X, we comprehensively analyzed 14 Streptococcus pneumoniae strains and 14 Chlamydia trachomatis strains. The results show that S. pneumoniae strains have higher diversity in genome structure and gene content than C. trachomatis strains. In addition, S. pneumoniae strains might have undergone many evolutionary events, such as genomic rearrangements, frequent horizontal gene transfer, homologous recombination, and other evolutionary processes.
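As an illustration of what a pan-genome profile curve summarizes, the sketch below computes mean pan-genome and core-genome sizes as strains are added, starting from a table of which ortholog clusters occur in which strain. It is a minimal example under stated assumptions, not the PGAP-X implementation: the function name `pan_core_profile`, the input layout, and the toy strain and cluster names are all hypothetical, and averaging over every strain combination is one common choice that the abstract does not confirm.

```python
from itertools import combinations


def pan_core_profile(presence):
    """Return {n: (mean_pan_size, mean_core_size)} for n = 1..N strains.

    presence maps a strain name to the set of ortholog cluster IDs found
    in that strain (hypothetical input layout, assumed for illustration).
    Sizes are averaged over all combinations of n strains.
    """
    strains = sorted(presence)
    profile = {}
    for n in range(1, len(strains) + 1):
        pan_total = 0
        core_total = 0
        count = 0
        for combo in combinations(strains, n):
            sets = [presence[s] for s in combo]
            # Pan-genome: clusters present in at least one selected strain.
            pan_total += len(set.union(*sets))
            # Core genome: clusters present in every selected strain.
            core_total += len(set.intersection(*sets))
            count += 1
        profile[n] = (pan_total / count, core_total / count)
    return profile


# Toy data (invented for this example, not taken from the paper).
toy = {
    "strainA": {"c1", "c2", "c3"},
    "strainB": {"c1", "c2", "c4"},
    "strainC": {"c1", "c3", "c5"},
}
profile = pan_core_profile(toy)
for n, (pan, core) in profile.items():
    print(n, round(pan, 2), round(core, 2))
```

Plotting the two means against n gives the familiar pair of curves: a pan-genome curve that keeps rising suggests high gene-content diversity, while one that flattens quickly suggests a more conserved gene repertoire. For 14 strains, full enumeration covers 16,383 combinations in total, which is still tractable; larger data sets would likely need random sampling of combinations instead.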
CONCLUSION
In brief, PGAP-X directly presents the characteristics of bacterial genomic diversity through different visualization methods, which could help researchers intuitively understand the dynamics and evolution of bacterial genomes. The source code and the precompiled executable programs are freely available from http://pgapx.ybzhao.com.