Laboratorio de Biología Computacional, Departamento de Desarrollo Biotecnológico, Instituto de Higiene, Facultad de Medicina, Universidad de la República, Montevideo, Uruguay.
Unidad de Genómica Evolutiva, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay.
BMC Bioinformatics. 2020 Jul 8;21(1):293. doi: 10.1186/s12859-020-03610-6.
Spliced leader (SL) trans-splicing is an important mechanism for the maturation of mRNAs in several eukaryotic lineages, including groups of parasites of great medical and economic importance. Nevertheless, its study across the tree of life is severely hindered by the difficulty of identifying the SL sequences that are being trans-spliced.
In this paper we present SLFinder, a four-step pipeline designed to identify candidate SL sequences de novo while making very few assumptions about the properties of the SL sequence. The pipeline takes de novo transcriptome assemblies and a reference genome as input and allows user intervention at several points to account for unexpected features of the dataset. The strategy and its implementation were tested on real RNA-seq data from species with and without SL trans-splicing.
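As a rough illustration of the underlying idea, and not SLFinder's actual implementation, the following Python sketch looks for short sequences shared by the 5' ends of many assembled transcripts, which is the kind of signal a trans-spliced leader would be expected to leave. The function name `candidate_sl_kmers` and the parameters `k`, `window`, and `min_fraction` are hypothetical, and the sketch assumes transcripts are given in 5'-to-3' orientation and ignores the reference genome entirely.

```python
from collections import Counter


def candidate_sl_kmers(transcripts, k=8, window=30, min_fraction=0.5):
    """Return k-mers found near the 5' end of many transcripts.

    Toy heuristic (an assumption for illustration, not SLFinder's method):
    a k-mer is reported if it occurs within the first `window` bases of
    at least `min_fraction` of the input transcripts.
    """
    counts = Counter()
    for seq in transcripts:
        head = seq[:window].upper()
        # Use a set so each transcript contributes at most once per k-mer.
        kmers = {head[i:i + k] for i in range(len(head) - k + 1)}
        counts.update(kmers)
    threshold = min_fraction * len(transcripts)
    return sorted(kmer for kmer, n in counts.items() if n >= threshold)
```

Calling the function on a handful of transcripts that begin with the same leader returns the k-mers tiling that leader, while k-mers from transcript-specific sequence fall below the threshold. A real pipeline would additionally need to handle unknown strand orientation, assembly errors, and genomic context, none of which this sketch attempts.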
SLFinder is capable of identifying SL candidates with good precision in a reasonable amount of time. It is especially suitable for species with unknown SL sequences, generating candidate sequences for further refinement and experimental validation.