Transcriptomics in Toxicogenomics, Part II: Preprocessing and Differential Expression Analysis for High Quality Data.

Author information

Federico Antonio, Serra Angela, Ha My Kieu, Kohonen Pekka, Choi Jang-Sik, Liampa Irene, Nymark Penny, Sanabria Natasha, Cattelani Luca, Fratello Michele, Kinaret Pia Anneli Sofia, Jagiello Karolina, Puzyn Tomasz, Melagraki Georgia, Gulumian Mary, Afantitis Antreas, Sarimveis Haralambos, Yoon Tae-Hyun, Grafström Roland, Greco Dario

Affiliations

Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland.

BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland.

Publication information

Nanomaterials (Basel). 2020 May 8;10(5):903. doi: 10.3390/nano10050903.

Abstract

Preprocessing of transcriptomics data plays a pivotal role in the development of toxicogenomics-driven tools for chemical toxicity assessment. The generation and exploitation of large volumes of molecular profiles, following an appropriate experimental design, allow toxicogenomics (TGx) approaches to be used for a thorough characterisation of the mechanism of action (MOA) of different compounds. To date, a plethora of data preprocessing methodologies have been suggested. However, in most cases, building the optimal analytical workflow is not straightforward. The right tools must be selected carefully, since the choice affects the downstream analyses and modelling approaches. Transcriptomics data preprocessing spans multiple steps, such as quality checking, filtering, normalization, and batch effect detection and correction. Currently, the TGx field lacks standard guidelines for data preprocessing. Defining the optimal tools and procedures for transcriptomics data preprocessing will lead to homogeneous and unbiased data, allowing the development of more reliable, robust and accurate predictive models. In this review, we outline methods for preprocessing data from three main transcriptomic technologies: microarray, bulk RNA-Sequencing (RNA-Seq), and single cell RNA-Sequencing (scRNA-Seq). Moreover, we discuss the most common methods for identifying differentially expressed genes and performing functional enrichment analysis. This review is the second part of a three-article series on Transcriptomics in Toxicogenomics.
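
As a rough illustration of three of the steps named above (normalization, filtering, and a first look at differential expression), the following minimal Python sketch may help. It is not a method taken from the review: the function names, the toy count table, the counts-per-million scaling, and the thresholds (`min_cpm`, `min_samples`, `pseudo`) are all assumptions chosen for illustration. It computes only a log2 fold change, which is not a statistical test.

```python
import math
from statistics import mean

def counts_per_million(counts):
    """Scale each sample (column) so that its counts sum to one million."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {gene: [row[j] / lib_sizes[j] * 1e6 for j in range(n_samples)]
            for gene, row in counts.items()}

def filter_low_expression(cpm, min_cpm=1.0, min_samples=2):
    """Keep genes whose CPM reaches min_cpm in at least min_samples samples."""
    return {gene: row for gene, row in cpm.items()
            if sum(v >= min_cpm for v in row) >= min_samples}

def log2_fold_change(cpm, control_idx, treated_idx, pseudo=1.0):
    """Difference of mean log2(CPM + pseudo) between treated and control."""
    result = {}
    for gene, row in cpm.items():
        logs = [math.log2(v + pseudo) for v in row]
        result[gene] = (mean(logs[j] for j in treated_idx)
                        - mean(logs[j] for j in control_idx))
    return result

# Hypothetical raw counts: two control samples, then two treated samples.
toy_counts = {
    "geneA": [100, 100, 400, 400],
    "geneB": [0, 1, 0, 0],
    "geneC": [900, 899, 600, 600],
}

cpm = counts_per_million(toy_counts)
kept = filter_low_expression(cpm)
lfc = log2_fold_change(kept, control_idx=[0, 1], treated_idx=[2, 3])
```

In this toy table, `geneB` is detected in only one sample and is dropped by the filter, `geneA` gets a positive fold change and `geneC` a negative one. A real analysis would add quality checks, batch effect handling, and a proper statistical model with multiple-testing correction.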

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef44/7279140/8e56238a4c18/nanomaterials-10-00903-g001.jpg
