Espindula Eliandro, Sperb Edilena Reis, Bach Evelise, Passaglia Luciane Maria Pereira
Universidade Federal do Rio Grande do Sul (UFRGS), Instituto de Biociências, Departamento de Genética, Porto Alegre, RS, Brazil.
Genet Mol Biol. 2020 Feb 10;42(4):e20190215. doi: 10.1590/1678-4685-GMB-2019-0215. eCollection 2019.
In Dual RNA-Seq experiments, simultaneously extracting RNA from both interacting organisms and analyzing their gene expression data can be challenging. One alternative is to separate the reads during in silico data analysis. Two main mapping methods are used: sequential and combined. Here we present a combined approach in which the libraries were aligned to a concatenated genome to sort the reads before mapping them to the respective annotated genomes. This method was compared with the sequential analysis. Two RNA-Seq libraries available in public databases, one from a eukaryotic organism (Zea mays) and one from a prokaryotic organism (Herbaspirillum seropedicae), were mixed to simulate a Dual RNA-Seq experiment. Libraries from real Dual RNA-Seq experiments were also used. The sequential analysis consistently attributed more reads to the first reference genome used in the analysis (due to cross-mapping) than the combined approach did. More importantly, the combined analysis resulted in fewer cross-mapped reads. Our results highlight the need to combine the reference genomes to sort reads before the counting step to avoid losing information in Dual RNA-Seq experiments. Since most studies first map the RNA-Seq libraries to the eukaryotic genome, much prokaryotic information has probably been lost.
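The read-sorting step of the combined approach can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `sort_reads_by_organism`, the contig names, and the organism labels are hypothetical, and the input is assumed to be a list of (read ID, contig name) pairs already parsed from alignments against a concatenated reference. Reads that hit contigs of only one organism are assigned to it; reads that hit both are set aside as ambiguous.

```python
from collections import defaultdict


def sort_reads_by_organism(alignments, contig_to_organism):
    """Sort reads by the organism whose contigs they aligned to.

    alignments: iterable of (read_id, contig_name) pairs (assumed input,
        e.g. parsed from alignments to a concatenated reference).
    contig_to_organism: dict mapping contig name -> organism label.
    Returns (assigned, ambiguous), where assigned maps each organism to the
    set of read IDs that hit only that organism, and ambiguous is the set
    of read IDs that hit contigs from more than one organism.
    """
    # Collect, for every read, the set of organisms it aligned to.
    hits = defaultdict(set)
    for read_id, contig in alignments:
        hits[read_id].add(contig_to_organism[contig])

    assigned = defaultdict(set)
    ambiguous = set()
    for read_id, organisms in hits.items():
        if len(organisms) == 1:
            assigned[next(iter(organisms))].add(read_id)
        else:
            ambiguous.add(read_id)
    return dict(assigned), ambiguous


# Toy data (hypothetical contig names and read IDs).
contig_to_organism = {"chr1": "maize", "bact_chr": "bacterium"}
alignments = [
    ("r1", "chr1"),
    ("r2", "bact_chr"),
    ("r3", "chr1"),
    ("r3", "bact_chr"),
]
assigned, ambiguous = sort_reads_by_organism(alignments, contig_to_organism)
```

In this sketch, each organism's assigned reads would then be mapped to that organism's annotated genome for counting. How ambiguous reads are handled (discarded, or resolved by alignment score) is a design choice the abstract does not specify.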