MultiBaC: an R package to remove batch effects in multi-omic experiments.

Affiliations

Gene Expression and RNA Metabolism Laboratory, Instituto de Biomedicina de Valencia, Consejo Superior de Investigaciones Científicas, Valencia 46010, Spain.

Multivariate Statistical Engineering Group, Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia 46022, Spain.

Publication information

Bioinformatics. 2022 Apr 28;38(9):2657-2658. doi: 10.1093/bioinformatics/btac132.

Abstract

MOTIVATION

Batch effects in omics datasets are usually a source of technical noise that masks the biological signal and hampers data analysis. Batch effect removal has been widely addressed for individual omics technologies. However, multi-omic datasets may combine data obtained in different batches where omics type and batch are often confounded. Moreover, systematic biases may be introduced without notice during data acquisition, which creates a hidden batch effect. Current methods fail to address batch effect correction in these cases.

RESULTS

In this article, we introduce the MultiBaC R package, a tool for batch effect removal in multi-omics and hidden batch effect scenarios. The package includes a variety of graphical outputs for model validation and assessment of the batch effect correction.

AVAILABILITY AND IMPLEMENTATION

The MultiBaC package is available on Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/MultiBaC.html) and GitHub (https://github.com/ConesaLab/MultiBaC.git). The data underlying this article are available in the Gene Expression Omnibus repository (accession numbers GSE11521, GSE1002, GSE56622 and GSE43747).

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
