Composite quantile regression approach to batch effect correction in microbiome data.

Authors

Park Jiwon, Park Taesung

Affiliations

Interdisciplinary Program of Bioinformatics, Seoul National University, Seoul, Republic of Korea.

Department of Statistics, Seoul National University, Seoul, Republic of Korea.

Publication information

Front Microbiol. 2025 Feb 25;16:1484183. doi: 10.3389/fmicb.2025.1484183. eCollection 2025.

DOI: 10.3389/fmicb.2025.1484183
PMID: 40071205
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11893821/

Abstract

BACKGROUND

Batch effects refer to data variations that arise from non-biological factors such as experimental conditions, equipment, and external factors. These effects are considered significant issues in the analysis of biological data since they can compromise data consistency and distort actual biological differences, which can severely skew the results of downstream analyses.

METHOD

In this study, we introduce a new approach that comprehensively addresses two types of batch effects: "systematic batch effects," which are consistent across all samples in a batch, and "nonsystematic batch effects," which vary depending on the variability of operational taxonomic units (OTUs) within each sample in the same batch. To address systematic batch effects, we apply a negative binomial regression model and correct for consistent batch influences by excluding fixed batch effects. To handle nonsystematic batch effects, we employ composite quantile regression. By adjusting the distribution of OTUs to be similar to that of a reference batch selected using the Kruskal-Wallis test, we account for variability at the OTU level.
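
The abstract does not state the objective function. As a minimal sketch, assuming the standard composite quantile regression formulation (one slope shared across several quantile levels, with a separate intercept per level), the loss could be written as follows. The toy data, the single batch-indicator covariate, and the names `check_loss` and `composite_quantile_loss` are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence


def check_loss(residual: float, tau: float) -> float:
    """Quantile (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def composite_quantile_loss(
    x: Sequence[float],
    y: Sequence[float],
    slope: float,
    intercepts: Sequence[float],
    taus: Sequence[float],
) -> float:
    """Sum the check loss over all quantile levels, sharing one slope."""
    if len(intercepts) != len(taus):
        raise ValueError("need one intercept per quantile level")
    total = 0.0
    for tau, intercept in zip(taus, intercepts):
        for xi, yi in zip(x, y):
            total += check_loss(yi - intercept - slope * xi, tau)
    return total


# Toy data (assumed): x = 0 marks the reference batch, x = 1 a second batch
# whose values are shifted upward by 2 units.
x = [0, 0, 0, 1, 1, 1]
y = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0]
taus = [0.25, 0.5, 0.75]
intercepts = [1.5, 2.0, 2.5]  # rough reference-batch quantiles (assumed)

loss_with_shift = composite_quantile_loss(x, y, 2.0, intercepts, taus)
loss_without_shift = composite_quantile_loss(x, y, 0.0, intercepts, taus)
print(loss_with_shift, loss_without_shift)
```

In this toy case the slope that matches the batch shift gives the smaller loss. A fitting routine would minimize this objective jointly over the slope and the intercepts, for example by linear programming.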

RESULTS

The performance of the model is evaluated and compared with existing methods using PERMANOVA R-squared values, principal coordinates analysis (PCoA) plots, and the average silhouette coefficient calculated with diverse distance-based metrics. The model is applied to three real microbiome datasets: metagenomic urine control data, Human Immunodeficiency Virus Re-analysis Consortium data, and Men and Women Offering Understanding of Throat HPV study data. The results demonstrate that the model effectively corrects for batch effects across all datasets.
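
Two of the evaluation metrics named above can be illustrated on a toy example. The sketch below assumes Bray-Curtis dissimilarity (the abstract does not say which distances were used) and uses batch labels as the grouping. It computes the average silhouette coefficient and the PERMANOVA R-squared from a distance matrix. All function names and count tables are hypothetical, and the permutation p-value of PERMANOVA is omitted.

```python
from itertools import combinations
from typing import List, Sequence


def bray_curtis(u: Sequence[float], v: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def distance_matrix(samples: Sequence[Sequence[float]]) -> List[List[float]]:
    return [[bray_curtis(s, t) for t in samples] for s in samples]


def average_silhouette(dist: List[List[float]], labels: Sequence[str]) -> float:
    """Mean silhouette width, treating each label as one cluster."""
    groups = sorted(set(labels))
    if len(groups) < 2:
        raise ValueError("need at least two groups")
    n = len(labels)
    scores = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster: silhouette defined as 0
            scores.append(0.0)
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == g) / labels.count(g)
            for g in groups
            if g != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / n


def permanova_r2(dist: List[List[float]], labels: Sequence[str]) -> float:
    """R-squared = 1 - SS_within / SS_total, from squared pairwise distances."""
    n = len(labels)
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in set(labels):
        idx = [i for i in range(n) if labels[i] == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    return 1.0 - ss_within / ss_total if ss_total > 0 else 0.0


# Hypothetical OTU counts (rows = samples, columns = OTUs) and batch labels.
batches = ["A", "A", "B", "B"]
before = [[10, 0], [9, 1], [0, 10], [1, 9]]  # batches clearly separated
after = [[5, 5], [6, 4], [5, 5], [4, 6]]     # batches largely overlapping

for name, table in (("before", before), ("after", after)):
    d = distance_matrix(table)
    print(name, average_silhouette(d, batches), permanova_r2(d, batches))
```

When the grouping is the batch label, lower values of both metrics after correction suggest that samples separate less by batch. When the grouping is a biological condition, higher values are the desirable direction.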

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad40/11893821/42d61ebfdeb0/fmicb-16-1484183-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad40/11893821/c651c048a1ca/fmicb-16-1484183-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad40/11893821/74cff676b056/fmicb-16-1484183-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad40/11893821/a8a799a1daca/fmicb-16-1484183-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad40/11893821/4953738aac2a/fmicb-16-1484183-g0005.jpg
