An object-oriented framework for evolutionary pangenome analysis.
Affiliations
Microbial Genomics Laboratory, Institut Pasteur Montevideo, Montevideo, Uruguay.
Center for Innovation in Epidemiological Surveillance, Institut Pasteur Montevideo, Montevideo, Uruguay.
Publication information
Cell Rep Methods. 2021 Sep 27;1(5):100085. doi: 10.1016/j.crmeth.2021.100085.
Pangenome analysis is fundamental to exploring molecular evolution in bacterial populations. Here, we introduce Pagoo, an R framework that enables straightforward handling of pangenome data. The encapsulated nature of Pagoo allows complex molecular and phenotypic information to be stored using an object-oriented approach. This makes it easy to move back and forth through the data within a single programming environment and to save any stage of the analysis (including the raw data) in a single file, making it shareable and reproducible. Pagoo provides tools to query, subset, compare, visualize, and perform statistical analyses, in concert with other microbial genomics packages available in the R ecosystem. As working examples, we used 1,000 Escherichia coli genomes to show that Pagoo is scalable, and a global dataset of Campylobacter fetus genomes to identify evolutionary patterns and genomic markers of host adaptation in this pathogen.