Transcriptomics in Toxicogenomics, Part II: Preprocessing and Differential Expression Analysis for High Quality Data.

Authors

Federico Antonio, Serra Angela, Ha My Kieu, Kohonen Pekka, Choi Jang-Sik, Liampa Irene, Nymark Penny, Sanabria Natasha, Cattelani Luca, Fratello Michele, Kinaret Pia Anneli Sofia, Jagiello Karolina, Puzyn Tomasz, Melagraki Georgia, Gulumian Mary, Afantitis Antreas, Sarimveis Haralambos, Yoon Tae-Hyun, Grafström Roland, Greco Dario

Affiliations

Faculty of Medicine and Health Technology, Tampere University, FI-33014 Tampere, Finland.

BioMediTech Institute, Tampere University, FI-33014 Tampere, Finland.

Publication information

Nanomaterials (Basel). 2020 May 8;10(5):903. doi: 10.3390/nano10050903.

DOI: 10.3390/nano10050903

PMID: 32397130

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7279140/

Abstract

Preprocessing of transcriptomics data plays a pivotal role in the development of toxicogenomics-driven tools for chemical toxicity assessment. The generation and exploitation of large volumes of molecular profiles, following an appropriate experimental design, allows the employment of toxicogenomics (TGx) approaches for a thorough characterisation of the mechanism of action (MOA) of different compounds. To date, a plethora of data preprocessing methodologies have been suggested. However, in most cases, building the optimal analytical workflow is not straightforward. The right tools must be selected carefully, since the choice affects the downstream analyses and modelling approaches. Transcriptomics data preprocessing spans multiple steps, such as quality check, filtering, normalization, and batch effect detection and correction. Currently, there is a lack of standard guidelines for data preprocessing in the TGx field. Defining the optimal tools and procedures to be employed in transcriptomics data preprocessing will lead to the generation of homogeneous and unbiased data, allowing the development of more reliable, robust and accurate predictive models. In this review, we outline methods for the preprocessing of three main transcriptomic technologies: microarray, bulk RNA-Sequencing (RNA-Seq), and single cell RNA-Sequencing (scRNA-Seq). Moreover, we discuss the most common methods for identifying differentially expressed genes and performing functional enrichment analysis. This review is the second part of a three-article series on Transcriptomics in Toxicogenomics.
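As a rough illustration of two of the steps the abstract names (filtering and normalization), followed by a naive expression comparison, the sketch below works on a toy count matrix. It is not taken from the review: the function names, the thresholds (`min_cpm`, `min_samples`), the pseudocount, and the counts themselves are all assumptions chosen for illustration. A real analysis would rely on dedicated packages and a proper statistical test rather than a bare fold change.

```python
import math

def counts_per_million(counts):
    """Scale one sample's raw counts to counts per million (CPM)."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]

def filter_low_expression(matrix, min_cpm=1.0, min_samples=2):
    """Return indices of genes whose CPM reaches min_cpm in at least min_samples samples."""
    n_genes = len(matrix[0])
    keep = []
    for g in range(n_genes):
        n_pass = sum(1 for sample in matrix if sample[g] >= min_cpm)
        if n_pass >= min_samples:
            keep.append(g)
    return keep

def log2_fold_change(cpm, treated, control, genes, pseudo=1.0):
    """Mean log2(CPM + pseudo) of treated samples minus that of controls, per gene."""
    result = {}
    for g in genes:
        t = [math.log2(cpm[s][g] + pseudo) for s in treated]
        c = [math.log2(cpm[s][g] + pseudo) for s in control]
        result[g] = sum(t) / len(t) - sum(c) / len(c)
    return result

# Toy data (hypothetical): rows are samples, columns are genes.
raw = [
    [100, 0, 900],   # control 1
    [120, 1, 879],   # control 2
    [400, 0, 600],   # treated 1
    [380, 0, 620],   # treated 2
]

cpm = [counts_per_million(sample) for sample in raw]   # normalization
kept = filter_low_expression(cpm)                      # filtering
lfc = log2_fold_change(cpm, treated=[2, 3], control=[0, 1], genes=kept)

for gene, value in lfc.items():
    print(f"gene {gene}: log2 fold change = {value:.2f}")
```

In this toy example the second gene is detected in only one sample and is filtered out, while the remaining genes receive a positive or negative log2 fold change depending on the direction of the shift. Quality checks, batch effect correction, and multiple-testing control are deliberately left out of the sketch.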

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ef44/7279140/8e56238a4c18/nanomaterials-10-00903-g001.jpg

Similar articles

Transcriptomics in Toxicogenomics, Part I: Experimental Design, Technologies, Publicly Available Data, and Regulatory Aspects.

Nanomaterials (Basel). 2020 Apr 15;10(4):750. doi: 10.3390/nano10040750.

Transcriptomics in Toxicogenomics, Part III: Data Modelling for Risk Assessment.

Nanomaterials (Basel). 2020 Apr 8;10(4):708. doi: 10.3390/nano10040708.

scNPF: an integrative framework assisted by network propagation and network fusion for preprocessing of single-cell RNA-seq data.

BMC Genomics. 2019 May 8;20(1):347. doi: 10.1186/s12864-019-5747-5.

Nextcast: A software suite to analyse and model toxicogenomics data.

Comput Struct Biotechnol J. 2022 Mar 18;20:1413-1426. doi: 10.1016/j.csbj.2022.03.014. eCollection 2022.

Microarray Data Preprocessing: From Experimental Design to Differential Analysis.

Methods Mol Biol. 2022;2401:79-100. doi: 10.1007/978-1-0716-1839-4_7.

Evaluation of Bioinformatics Approaches for Next-Generation Sequencing Analysis of microRNAs with a Toxicogenomics Study Design.

Front Genet. 2018 Feb 6;9:22. doi: 10.3389/fgene.2018.00022. eCollection 2018.

FastqPuri: high-performance preprocessing of RNA-seq data.

BMC Bioinformatics. 2019 May 3;20(1):226. doi: 10.1186/s12859-019-2799-0.

How to design a single-cell RNA-sequencing experiment: pitfalls, challenges and perspectives.

Brief Bioinform. 2019 Jul 19;20(4):1384-1394. doi: 10.1093/bib/bby007.

RNA-seq preprocessing and sample size considerations for gene network inference.

bioRxiv. 2023 Jan 3:2023.01.02.522518. doi: 10.1101/2023.01.02.522518.

Cited by

Advancing chemical safety assessment through an omics-based characterization of the test system-chemical interaction.

Front Toxicol. 2023 Nov 9;5:1294780. doi: 10.3389/ftox.2023.1294780. eCollection 2023.

A curated gene and biological system annotation of adverse outcome pathways related to human health.

Sci Data. 2023 Jun 24;10(1):409. doi: 10.1038/s41597-023-02321-w.

Transcriptomic Analysis of Diethylstilbestrol in Daphnia Magna: Energy Metabolism and Growth Inhibition.

Toxics. 2023 Feb 20;11(2):197. doi: 10.3390/toxics11020197.

Data-driven analysis and druggability assessment methods to accelerate the identification of novel cancer targets.

Comput Struct Biotechnol J. 2022 Nov 24;21:46-57. doi: 10.1016/j.csbj.2022.11.042. eCollection 2023.

The potential of a data centred approach & knowledge graph data representation in chemical safety and drug design.

Comput Struct Biotechnol J. 2022 Sep 5;20:4837-4849. doi: 10.1016/j.csbj.2022.08.061. eCollection 2022.

Characterization of ENM Dynamic Dose-Dependent MOA in Lung with Respect to Immune Cells Infiltration.

Nanomaterials (Basel). 2022 Jun 13;12(12):2031. doi: 10.3390/nano12122031.

Comparative Toxicotranscriptomics of Single Cell RNA-Seq and Conventional RNA-Seq in TCDD-Exposed Testicular Tissue.

Front Toxicol. 2022 May 9;4:821116. doi: 10.3389/ftox.2022.821116. eCollection 2022.

Nextcast: A software suite to analyse and model toxicogenomics data.

Comput Struct Biotechnol J. 2022 Mar 18;20:1413-1426. doi: 10.1016/j.csbj.2022.03.014. eCollection 2022.

Analysis of Nanotoxicity with Integrated Omics and Mechanobiology.

Nanomaterials (Basel). 2021 Sep 13;11(9):2385. doi: 10.3390/nano11092385.

Advances in de Novo Drug Design: From Conventional to Machine Learning Methods.

Int J Mol Sci. 2021 Feb 7;22(4):1676. doi: 10.3390/ijms22041676.
