On data normalization and batch-effect correction for tumor subtyping with microRNA data.

Authors

Wu Yilin, Yuen Becky Wing-Yan, Wei Yingying, Qin Li-Xuan

Affiliations

Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.

Department of Statistics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, SAR, China.

Publication information

NAR Genom Bioinform. 2023 Jan 10;5(1):lqac100. doi: 10.1093/nargab/lqac100. eCollection 2023 Mar.

Abstract

The discovery of new tumor subtypes has been aided by transcriptomics profiling. However, some new subtypes can be irreproducible due to data artifacts that arise from disparate experimental handling. To deal with these artifacts, methods for data normalization and batch-effect correction have been applied before sample clustering for disease subtyping, even though these methods were primarily developed for group comparison. It remains to be elucidated whether they are effective for sample clustering. We examined this issue with a re-sampling-based simulation study that leverages a pair of microRNA microarray data sets. Our study showed that (i) normalization generally benefited the discovery of sample clusters, and quantile normalization tended to be the best performer; (ii) batch-effect correction was harmful when data artifacts were confounded with biological signals; and (iii) the performance of both can be influenced by the choice of clustering method, with the Partitioning Around Medoids (PAM) method based on Pearson correlation consistently among the best performers. Our study provides important insights on the use of data normalization and batch-effect correction in connection with the design of array-to-sample assignment and the choice of clustering method, for facilitating accurate and reproducible discovery of tumor subtypes with microRNAs.
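As a rough illustration of two ingredients named in the abstract, the sketch below applies quantile normalization to a features-by-samples matrix and then computes a Pearson-correlation-based distance between samples, which could be passed to a medoid-based clustering routine. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function names `quantile_normalize` and `pearson_distance`, the toy matrix, and the tie-handling rule (ties broken by order of appearance) are all hypothetical choices made for illustration.

```python
import numpy as np


def quantile_normalize(x: np.ndarray) -> np.ndarray:
    """Quantile-normalize the columns (samples) of a features-by-samples matrix.

    Assumption: ties within a column are broken by order of appearance,
    which is a simplification of how some packages average tied ranks.
    """
    x = np.asarray(x, dtype=float)
    # Row indices that would sort each column independently.
    order = np.argsort(x, axis=0, kind="stable")
    sorted_x = np.take_along_axis(x, order, axis=0)
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = sorted_x.mean(axis=1)
    # Write the reference value for rank k back to the row that held rank k.
    out = np.empty_like(x)
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out


def pearson_distance(x: np.ndarray) -> np.ndarray:
    """Return 1 - Pearson correlation between samples (columns of x)."""
    return 1.0 - np.corrcoef(x, rowvar=False)


# Toy data (hypothetical): 4 features measured on 3 samples.
toy = np.array(
    [
        [5.0, 4.0, 3.0],
        [2.0, 1.0, 4.0],
        [3.0, 4.0, 6.0],
        [4.0, 2.0, 8.0],
    ]
)
normalized = quantile_normalize(toy)
distances = pearson_distance(normalized)
```

After normalization every sample shares the same empirical distribution, so the remaining differences between samples lie only in the ordering of features. The resulting sample-by-sample distance matrix is the kind of input a medoid-based method would take.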
