Jiménez-Jacinto Verónica, Sanchez-Flores Alejandro, Vega-Alvarado Leticia
Unidad Universitaria de Secuenciación Masiva y Bioinformática, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Mexico.
Instituto de Ciencias Aplicadas y Tecnología, Universidad Nacional Autónoma de México, Mexico City, Mexico.
Front Genet. 2019 Mar 29;10:279. doi: 10.3389/fgene.2019.00279. eCollection 2019.
Current DNA sequencing technologies and their high-throughput yield have allowed genomic and transcriptomic experiments to thrive, but they have also created a big data problem. With the exponential growth of sequencing data, the complexity of managing, processing and interpreting it to generate results has also risen. Therefore, the demand for easy-to-use software and websites to run bioinformatic tools is pressing. In particular, RNA-Seq and differential expression analysis have become a popular and useful method to evaluate changes in gene expression in any organism. However, many scientists struggle with the data analysis, since most of the available tools are implemented in a UNIX-based environment. We have therefore developed the web server IDEAMEX (Integrative Differential Expression Analysis for Multiple EXperiments). The IDEAMEX pipeline takes a raw count table covering as many replicates and conditions as desired, and lets the user select which conditions will be compared instead of running all-vs.-all comparisons. The whole process consists of three main steps: (1) Data Analysis, which provides a preliminary quality-control analysis based on the data distribution per sample, using different types of graphs; (2) Differential expression, which performs the differential expression analysis with or without batch effect error awareness, using the Bioconductor packages NOISeq, limma-Voom, DESeq2 and edgeR, and generates reports for each method; (3) Result integration, in which the integrated results are reported using different graphical outputs such as correlograms, heatmaps, Venn diagrams and text lists. Our server offers easy, user-friendly visualization of results and simple interaction during the analysis process, as well as error tracking and debugging through output log files.
The server is currently available and can be accessed at http://www.uusmb.unam.mx/ideamex/, where documentation and example input files are provided. We believe this web server can help researchers with no previous bioinformatics knowledge perform their analyses in a simple manner.
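Two ideas in the abstract can be illustrated with a minimal Python sketch: choosing a subset of comparisons rather than running all-vs.-all, and integrating the gene lists reported by several methods, which is the information a Venn diagram summarizes. This is not the IDEAMEX implementation. The function names, condition names and gene lists below are hypothetical, and no differential expression package is actually run.

```python
from itertools import combinations


def all_vs_all(conditions):
    """Return every unordered pair of conditions (the all-vs.-all design)."""
    return list(combinations(conditions, 2))


def integrate(results):
    """Group genes by how many methods reported them.

    `results` maps a method name to the set of genes that method reported.
    Returns a dict: number of supporting methods -> sorted list of genes.
    """
    support = {}
    for genes in results.values():
        for gene in genes:
            support[gene] = support.get(gene, 0) + 1
    by_level = {}
    for gene, n_methods in support.items():
        by_level.setdefault(n_methods, []).append(gene)
    return {n: sorted(genes) for n, genes in by_level.items()}


# Hypothetical experiment with four conditions.
conditions = ["control", "treatA", "treatB", "treatC"]

# The user picks only the comparisons of interest (assumed choice).
selected = [("control", "treatA"), ("control", "treatB")]

# Made-up gene lists for one comparison; not real output from any package.
results = {
    "NOISeq": {"g1", "g2", "g3"},
    "limma-Voom": {"g1", "g2", "g4"},
    "DESeq2": {"g1", "g2", "g3", "g5"},
    "edgeR": {"g1", "g3"},
}

# Genes reported by all four methods (the central region of a Venn diagram).
consensus = set.intersection(*results.values())
by_level = integrate(results)

print(f"{len(selected)} of {len(all_vs_all(conditions))} possible comparisons selected")
print("Reported by all methods:", sorted(consensus))
for n_methods in sorted(by_level, reverse=True):
    print(f"Reported by {n_methods} method(s):", by_level[n_methods])
```

With four conditions, the all-vs.-all design has six pairwise comparisons, so restricting the analysis to the two selected pairs avoids generating reports that nobody asked for. Grouping genes by the number of supporting methods is one simple way to rank candidates by agreement between methods.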