BatMan: Mitigating Batch Effects Via Stratification for Survival Outcome Prediction.

Affiliations

Division of Biostatistics, College of Public Health, Ohio State University, Columbus, OH.

Department of Population Health, New York University, New York, NY.

Publication information

JCO Clin Cancer Inform. 2023 Jun;7:e2200138. doi: 10.1200/CCI.22.00138.

DOI: 10.1200/CCI.22.00138
PMID: 37335961
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10530623/
Abstract

Reproducible translation of transcriptomics data has been hampered by the ubiquitous presence of batch effects. Statistical methods for managing batch effects were initially developed for sample group comparison and later borrowed for other settings such as survival outcome prediction. The most notable such method is ComBat, which adjusts for batch by including it as a covariate alongside sample group in a linear regression. In survival prediction, however, ComBat is applied without definable groups for the survival outcome, and it is run as a separate step ahead of survival regression even though the outcome may be confounded with batch. To address these issues, we propose a new method called BATch MitigAtion via stratificatioN (BatMan). It treats batches as strata in survival regression and uses variable selection methods such as regularized regression to handle high dimensionality. We assess the performance of BatMan in comparison with ComBat, each used either alone or in conjunction with data normalization, in a resampling-based simulation study under various levels of predictive signal strength and patterns of batch-outcome association. Our simulations show that (1) BatMan outperforms ComBat in nearly all scenarios when there are batch effects in the data and (2) the performance of both methods can be worsened by the addition of data normalization. We further evaluate them using microRNA data for ovarian cancer from The Cancer Genome Atlas and find that BatMan outperforms ComBat, while the addition of data normalization worsens the prediction. Our study thus shows the advantage of BatMan and raises caution about the use of data normalization when developing survival prediction models. The BatMan method and the simulation tool for performance assessment are implemented in R and publicly available on GitHub at LXQin/PRECISION.survival.
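The idea of adjusting for batches as strata in a regularized survival regression can be illustrated with a minimal Python sketch. It is not the authors' R implementation in PRECISION.survival; the function names, the toy data, and the choice of an L1 (lasso) penalty with a Breslow-style partial likelihood are assumptions made here for illustration only. The sketch sums the Cox partial log-likelihood within each batch, so risk sets never mix samples from different batches, and then checks that a batch-specific constant shift in the linear predictor (equivalently, a batch-specific baseline hazard) leaves the stratified likelihood unchanged while it changes the pooled one.

```python
import math


def cox_partial_loglik(times, events, scores):
    """Cox partial log-likelihood (Breslow form) for one group of samples.

    times: follow-up times; events: 1 = event observed, 0 = censored;
    scores: linear predictors x_i . beta, already computed.
    """
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue  # censored samples contribute only through risk sets
        # Risk set: every sample still under observation at time t_i.
        denom = sum(math.exp(s) for t, s in zip(times, scores) if t >= t_i)
        loglik += scores[i] - math.log(denom)
    return loglik


def stratified_partial_loglik(times, events, scores, batches):
    """Sum per-batch partial log-likelihoods (batches act as strata)."""
    total = 0.0
    for b in sorted(set(batches)):
        idx = [i for i, bb in enumerate(batches) if bb == b]
        total += cox_partial_loglik(
            [times[i] for i in idx],
            [events[i] for i in idx],
            [scores[i] for i in idx],
        )
    return total


def linear_predictor(X, beta):
    """Row-wise dot product of the feature matrix X with coefficients beta."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


def lasso_objective(times, events, X, beta, batches, lam):
    """Negative stratified partial log-likelihood plus an L1 penalty."""
    scores = linear_predictor(X, beta)
    penalty = lam * sum(abs(b) for b in beta)
    return -stratified_partial_loglik(times, events, scores, batches) + penalty


# Toy data, made up for illustration: two batches of three samples each.
times = [5.0, 8.0, 3.0, 9.0, 4.0, 7.0]
events = [1, 0, 1, 1, 1, 0]
X = [[0.2, 1.0], [-0.5, 0.3], [1.1, -0.4], [0.0, 0.8], [0.7, 0.1], [-0.9, -0.6]]
batches = ["A", "A", "A", "B", "B", "B"]
beta = [0.5, -0.25]

scores = linear_predictor(X, beta)
# Hypothetical batch effect: every sample in batch B gets the same shift.
shifted = [s + (2.0 if b == "B" else 0.0) for s, b in zip(scores, batches)]

strat_same = math.isclose(
    stratified_partial_loglik(times, events, scores, batches),
    stratified_partial_loglik(times, events, shifted, batches),
)
pooled_same = math.isclose(
    cox_partial_loglik(times, events, scores),
    cox_partial_loglik(times, events, shifted),
)
print("stratified likelihood unchanged by batch shift:", strat_same)
print("pooled likelihood unchanged by batch shift:", pooled_same)
```

In practice, an objective like `lasso_objective` would be minimized over `beta` by a dedicated solver, and the penalty weight `lam` would be tuned by cross-validation. The sketch only evaluates the objective to show how the strata enter it.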
