Removing batch effects for prediction problems with frozen surrogate variable analysis.

Authors

Parker Hilary S, Corrada Bravo Héctor, Leek Jeffrey T

Affiliations

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.

Center for Bioinformatics and Computational Biology, Department of Computer Science, University of Maryland, College Park, MD, USA.

Publication information

PeerJ. 2014 Sep 23;2:e561. doi: 10.7717/peerj.561. eCollection 2014.

DOI: 10.7717/peerj.561
PMID: 25332844
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4179553/
Abstract

Batch effects are responsible for the failure of promising genomic prognostic signatures, major ambiguities in published genomic results, and retractions of widely publicized findings. Batch effect corrections have been developed to remove these artifacts, but they are designed for population studies. Genomic technologies, however, are beginning to be used in clinical applications where samples are analyzed one at a time for diagnostic, prognostic, and predictive purposes. There are currently no batch correction methods developed specifically for prediction. In this paper, we propose a new method called frozen surrogate variable analysis (fSVA) that borrows strength from a training set for individual sample batch correction. We show that fSVA improves prediction accuracy in simulations and in public genomic studies. fSVA is available as part of the sva Bioconductor package.
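
The idea of borrowing strength from a training set to correct one sample at a time can be illustrated with a small sketch. The Python below is not the fSVA algorithm and not the sva package's code (sva is an R/Bioconductor package). It is a simplified stand-in that assumes latent batch factors can be approximated by the leading singular vectors of the class-centered training data, which are then "frozen" and projected out of each new sample. The function names `fit_frozen_loadings` and `correct_new_sample` and the parameter `n_factors` are hypothetical.

```python
import numpy as np


def fit_frozen_loadings(train, labels, n_factors=1):
    """Estimate gene loadings of latent (batch-like) factors from training data.

    train  : array of shape (genes, samples)
    labels : known outcome class of each training sample
    Simplified illustration only; not the published fSVA procedure.
    """
    train = np.asarray(train, dtype=float)
    labels = np.asarray(labels)
    gene_means = train.mean(axis=1)

    # Subtract per-class means so the outcome signal is not mistaken for batch.
    resid = train.copy()
    for cls in np.unique(labels):
        cols = labels == cls
        resid[:, cols] -= resid[:, cols].mean(axis=1, keepdims=True)

    # Leading left singular vectors of the residuals serve as frozen loadings.
    u, _, _ = np.linalg.svd(resid, full_matrices=False)
    return u[:, :n_factors], gene_means


def correct_new_sample(sample, loadings, gene_means):
    """Remove the frozen factors from one new sample (vector of length genes)."""
    sample = np.asarray(sample, dtype=float)
    centered = sample - gene_means
    scores = loadings.T @ centered  # factor scores for this single sample
    return sample - loadings @ scores
```

Because the loadings are fixed after training, each incoming sample can be corrected on its own, without re-estimating anything from a population of new samples. In this sketch the training matrix would be cleaned with the same projection before a classifier is fit, so that training and test samples live in the same corrected space.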


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c7e/4179553/9bc8002d20e2/peerj-02-561-g001.jpg

Similar articles

1
Removing batch effects for prediction problems with frozen surrogate variable analysis.
PeerJ. 2014 Sep 23;2:e561. doi: 10.7717/peerj.561. eCollection 2014.
2
The sva package for removing batch effects and other unwanted variation in high-throughput experiments.
Bioinformatics. 2012 Mar 15;28(6):882-3. doi: 10.1093/bioinformatics/bts034. Epub 2012 Jan 17.
3
Preserving biological heterogeneity with a permuted surrogate variable analysis for genomics batch correction.
Bioinformatics. 2014 Oct;30(19):2757-63. doi: 10.1093/bioinformatics/btu375. Epub 2014 Jun 6.
4
svaseq: removing batch effects and other unwanted noise from sequencing data.
Nucleic Acids Res. 2014 Dec 1;42(21):e161. doi: 10.1093/nar/gku864. Epub 2014 Oct 7.
5
Blind estimation and correction of microarray batch effect.
PLoS One. 2020 Apr 9;15(4):e0231446. doi: 10.1371/journal.pone.0231446. eCollection 2020.
6
The practical effect of batch on genomic prediction.
Stat Appl Genet Mol Biol. 2012;11(3):Article 10. doi: 10.1515/1544-6115.1766.
7
Overcoming the impacts of two-step batch effect correction on gene expression estimation and inference.
Biostatistics. 2023 Jul 14;24(3):635-652. doi: 10.1093/biostatistics/kxab039.
8
Robustifying genomic classifiers to batch effects via ensemble learning.
Bioinformatics. 2021 Jul 12;37(11):1521-1527. doi: 10.1093/bioinformatics/btaa986.
9
Practical impacts of genomic data "cleaning" on biological discovery using surrogate variable analysis.
BMC Bioinformatics. 2015 Nov 6;16:372. doi: 10.1186/s12859-015-0808-5.
10
SVAw - a web-based application tool for automated surrogate variable analysis of gene expression studies.
Source Code Biol Med. 2013 Mar 11;8(1):8. doi: 10.1186/1751-0473-8-8.

Cited by

1
Discovery of novel diagnostic biomarkers of hepatocellular carcinoma associated with immune infiltration.
Ann Med. 2025 Dec;57(1):2503645. doi: 10.1080/07853890.2025.2503645. Epub 2025 May 29.
2
Mutant CEBPA promotes tolerance to inflammatory stress through deficient AP-1 activation.
Nat Commun. 2025 Apr 12;16(1):3492. doi: 10.1038/s41467-025-58712-7.
3
Clear-Cell Renal Cell Carcinoma Molecular Subtypes Differ by African and European Genetic Similarity.
Cancer Res Commun. 2025 May 1;5(5):743-755. doi: 10.1158/2767-9764.CRC-24-0624.
4
MultiOmicsIntegrator: a nextflow pipeline for integrated omics analyses.
Bioinform Adv. 2024 Nov 14;4(1):vbae175. doi: 10.1093/bioadv/vbae175. eCollection 2024.
5
Developing a DNA Methylation Signature to Differentiate High-Grade Serous Ovarian Carcinomas from Benign Ovarian Tumors.
Mol Diagn Ther. 2024 Nov;28(6):821-834. doi: 10.1007/s40291-024-00740-y. Epub 2024 Oct 16.
6
Thinking points for effective batch correction on biomedical data.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae515.
7
A high-resolution map of functional miR-181 response elements in the thymus reveals the role of coding sequence targeting and an alternative seed match.
Nucleic Acids Res. 2024 Aug 12;52(14):8515-8533. doi: 10.1093/nar/gkae416.
8
Antileukemic potential of methylated indolequinone MAC681 through immunogenic necroptosis and PARP1 degradation.
Biomark Res. 2024 May 4;12(1):47. doi: 10.1186/s40364-024-00594-w.
9
Differential gene expression patterns in ST-elevation Myocardial Infarction and Non-ST-elevation Myocardial Infarction.
Sci Rep. 2024 Feb 10;14(1):3424. doi: 10.1038/s41598-024-54086-w.
10
Development of an genotoxicity assay to detect retroviral vector-induced lymphoid insertional mutants.
Mol Ther Methods Clin Dev. 2023 Aug 22;30:515-533. doi: 10.1016/j.omtm.2023.08.017. eCollection 2023 Sep 14.

References

1
Remarks on Parallel Analysis.
Multivariate Behav Res. 1992 Oct 1;27(4):509-40. doi: 10.1207/s15327906mbr2704_2.
2
Increasing consistency of disease biomarker prediction across datasets.
PLoS One. 2014 Apr 16;9(4):e91272. doi: 10.1371/journal.pone.0091272. eCollection 2014.
3
The practical effect of batch on genomic prediction.
Stat Appl Genet Mol Biol. 2012;11(3):Article 10. doi: 10.1515/1544-6115.1766.
4
Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies.
Int J Epidemiol. 2012 Feb;41(1):200-9. doi: 10.1093/ije/dyr238.
5
Learning from our GWAS mistakes: from experimental design to scientific method.
Biostatistics. 2012 Apr;13(2):195-203. doi: 10.1093/biostatistics/kxr055. Epub 2012 Jan 27.
6
The sva package for removing batch effects and other unwanted variation in high-throughput experiments.
Bioinformatics. 2012 Mar 15;28(6):882-3. doi: 10.1093/bioinformatics/bts034. Epub 2012 Jan 17.
7
Using control genes to correct for unwanted variation in microarray data.
Biostatistics. 2012 Jul;13(3):539-52. doi: 10.1093/biostatistics/kxr034. Epub 2011 Nov 17.
8
Retraction.
Science. 2011 Jul 22;333(6041):404. doi: 10.1126/science.333.6041.404-a.
9
Personalized medicine: progress and promise.
Annu Rev Genomics Hum Genet. 2011;12:217-44. doi: 10.1146/annurev-genom-082410-101446.
10
Tackling the widespread and critical impact of batch effects in high-throughput data.
Nat Rev Genet. 2010 Oct;11(10):733-9. doi: 10.1038/nrg2825. Epub 2010 Sep 14.