BCCM/IHEM, Mycology and Aerobiology, Sciensano, 1050 Bruxelles, Belgium.
BCCM/ULC Collection, InBioS-Centre for Protein Engineering, University of Liège, 4000 Liège, Belgium.
Genes (Basel). 2021 Oct 29;12(11):1741. doi: 10.3390/genes12111741.
The continuous increase in sequenced genomes in public repositories makes it ever harder to choose interesting bacterial strains for future sequencing projects, as it is difficult to estimate the redundancy between these strains and the genomes already available. Therefore, we developed the Nextflow workflow "ORPER" (for "ORganism PlacER"), containerized in Singularity, which determines the phylogenetic position of a collection of organisms in the genomic landscape. ORPER constrains the phylogenetic placement of SSU (16S) rRNA sequences in a multilocus reference tree based on ribosomal protein genes extracted from public genomes. We demonstrate the utility of ORPER on the phylum Cyanobacteria by placing 152 strains of the BCCM/ULC collection.