Valdes Camilo, Stebliankin Vitalii, Ruiz-Perez Daniel, Park Ji In, Lee Hajeong, Narasimhan Giri
Lawrence Livermore National Laboratory, Physical and Life Sciences Directorate, Livermore, CA, United States.
Bioinformatics Research Group (BioRG), Florida International University, Miami, FL, United States.
Front Bioinform. 2023 Jun 19;3:1154588. doi: 10.3389/fbinf.2023.1154588. eCollection 2023.
Abundance profiles from metagenomic sequencing data synthesize information from billions of sequenced reads coming from thousands of microbial genomes. Analyzing and understanding these profiles can be a challenge since the data they represent are complex. Particularly challenging is their visualization, as existing techniques are inadequate when the number of taxa is in the thousands. We present a technique, and accompanying software, for the visualization of metagenomic abundance profiles using a space-filling curve that transforms a profile into an interactive 2D image. We created Jasper, an easy-to-use tool for the visualization and exploration of metagenomic profiles from DNA sequencing data. It orders taxa along a space-filling Hilbert curve and creates a "map", where each position in the image represents the abundance of a single taxon from a reference collection. Jasper can order taxa in multiple ways, and the resulting maps can highlight "hot spots" of microbes that are dominant in taxonomic clades or biological conditions. We use Jasper to visualize samples from a variety of microbiome studies, and discuss ways in which these maps can be an invaluable tool to visualize spatial, temporal, disease, and differential profiles. Our approach can create detailed maps involving hundreds of thousands of microbial reference genomes, with the potential to unravel latent relationships (taxonomic, spatio-temporal, functional, and other) that could remain hidden using traditional visualization techniques. The maps can also be converted into animated movies that bring to life the dynamics of microbiomes.
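The core idea of placing an ordered list of taxa along a Hilbert curve can be illustrated with a minimal Python sketch. This is not Jasper's implementation: the function names `hilbert_d2xy` and `profile_to_grid` are hypothetical, the abundance values in the usage note are made up, and the index-to-coordinate conversion is the standard textbook algorithm for a Hilbert curve on a square grid whose side is a power of two. The sketch assumes the taxa have already been put in the desired order.

```python
def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order x 2**order grid."""
    n = 1 << order          # side length of the grid
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:         # rotate/flip the quadrant so sub-curves connect
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def profile_to_grid(abundances):
    """Place an ordered abundance profile on the smallest Hilbert grid that fits it.

    Cells beyond the end of the profile are left at 0.0 (an assumption of this sketch).
    """
    order = 0
    while 4 ** order < len(abundances):
        order += 1
    side = 1 << order
    grid = [[0.0] * side for _ in range(side)]
    for i, value in enumerate(abundances):
        x, y = hilbert_d2xy(order, i)
        grid[y][x] = value   # row index is y, column index is x
    return grid
```

For example, `profile_to_grid([0.1, 0.2, 0.3, 0.4])` yields a 2 x 2 grid. Because consecutive indices along the curve always land in neighboring cells, taxa that are close in the chosen ordering stay close in the image, which is what allows clusters of abundant, related taxa to appear as "hot spots".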