Viroscience Department, Erasmus Medical Center, Doctor Molewaterplein 40, 3015 GD Rotterdam, The Netherlands.
Viruses. 2021 Mar 9;13(3):437. doi: 10.3390/v13030437.
Complex virome sequencing data remain difficult to explore and unpack for scientists without a background in data science. Processing raw sequencing data with high-throughput sequencing workflows usually yields contigs in FASTA format, coupled to an annotation file that links each contig to a reference sequence or taxonomic identifier. The next step is to compare the viromes of different samples based on the metadata of the experimental setup and to extract sequences of interest for use in subsequent analyses. The viromeBrowser is an application written in the open-source R Shiny framework that was developed in collaboration with end-users and focuses on three common data analysis steps. First, the application allows interactive filtering of annotations by default or custom quality thresholds. Next, multiple samples can be visualized to facilitate comparison of contig annotations based on sample-specific metadata values. Last, the application makes it easy for users to extract sequences of interest in FASTA format. With the interactive features in the viromeBrowser, we aim to enable scientists without a data science background to compare and extract annotation data and sequences from virome sequencing analysis results.
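The filtering and extraction steps described in the abstract can be illustrated with a minimal Python sketch. This is not the viromeBrowser implementation, which is an R Shiny application. The column names (`contig_id`, `taxon`, `evalue`, `identity`, `alignment_length`), the threshold defaults, and the toy data are all assumptions made up for illustration.

```python
import csv
import io


def read_fasta(handle):
    """Parse FASTA text into a dict mapping contig id -> sequence."""
    records, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            # Use the first whitespace-delimited token of the header as the id.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def filter_annotations(rows, max_evalue=1e-5, min_identity=70.0, min_length=100):
    """Keep annotation rows that pass all three (hypothetical) quality thresholds."""
    return [
        row for row in rows
        if float(row["evalue"]) <= max_evalue
        and float(row["identity"]) >= min_identity
        and int(row["alignment_length"]) >= min_length
    ]


def extract_fasta(records, rows):
    """Return FASTA text for the contigs named in the given annotation rows."""
    seen, out = set(), []
    for row in rows:
        contig_id = row["contig_id"]
        if contig_id in records and contig_id not in seen:
            seen.add(contig_id)
            out.append(f">{contig_id}\n{records[contig_id]}\n")
    return "".join(out)


# Toy inputs, invented for this example only.
FASTA_TEXT = ">contig_1 len=8\nACGTACGT\n>contig_2\nGGGG\nCCCC\n>contig_3\nTTTTAAAA\n"
ANNOTATION_TSV = (
    "contig_id\ttaxon\tevalue\tidentity\talignment_length\n"
    "contig_1\tvirus_A\t1e-30\t95.2\t250\n"
    "contig_2\tvirus_B\t1e-3\t88.0\t300\n"
    "contig_3\tvirus_C\t1e-12\t65.0\t180\n"
)

records = read_fasta(io.StringIO(FASTA_TEXT))
rows = list(csv.DictReader(io.StringIO(ANNOTATION_TSV), delimiter="\t"))
kept = filter_annotations(rows)
selected_fasta = extract_fasta(records, kept)
print(selected_fasta, end="")
```

With these toy inputs only `contig_1` passes all three thresholds: `contig_2` fails the e-value cutoff and `contig_3` fails the identity cutoff. Loosening `max_evalue` or `min_identity` corresponds to the custom thresholds a user would set interactively.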