
Normalization of Large-Scale Transcriptome Data Using Heuristic Methods.

Authors

Yosef Arthur, Shnaider Eli, Schneider Moti, Gurevich Michael

Affiliations

Tel Aviv-Yaffo Academic College, Yaffo, Israel.

Netanya Academic College, Netanya, Israel.

Publication information

Bioinform Biol Insights. 2023 Mar 31;17:11779322231160397. doi: 10.1177/11779322231160397. eCollection 2023.

DOI:10.1177/11779322231160397
PMID:37020503
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10068970/
Abstract

In this study, we introduce an artificial intelligence method for addressing the batch effect in transcriptome data. The method has several clear advantages over the alternative methods presently in use. Batch effect refers to discrepancies between gene expression data series measured under different conditions. While data from the same batch (measurements performed under the same conditions) are compatible, combining various batches into one data set is problematic because the measurements are incompatible. Therefore, the combined data must be corrected (normalized) before biological analysis is performed. Numerous methods attempt to correct data sets for batch effect. These methods rely on various assumptions regarding the distribution of the measurements. Forcing the data elements into a presupposed distribution can severely distort biological signals, leading to incorrect results and conclusions. The wider the discrepancy between the assumed and the actual data distribution, the greater the biases introduced by such "correction methods." We introduce a heuristic method to reduce batch effect. The method does not rely on any assumptions regarding the distribution and behavior of data elements. Hence, it does not introduce any new biases in the process of correcting the batch effect. It strictly maintains the integrity of measurements within the original batches.
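
The abstract's warning about assumption-driven correction can be illustrated with a small Python sketch. This is not the authors' heuristic method, which the abstract does not describe. It is a toy example with hypothetical values for a single gene, in which per-batch mean centering (a simple correction that assumes every batch shares the same true mean) is applied to two batches that differ both technically and biologically. The names `BATCH_OFFSET`, `center`, and all numbers are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical expression values for one gene (arbitrary units).
# Batch A holds only healthy samples; batch B holds only diseased samples.
true_healthy = [5.0, 5.5, 4.5]
true_disease = [7.0, 7.5, 6.5]

# Assumed technical shift that affects every measurement in batch B.
BATCH_OFFSET = 2.0

batch_a = list(true_healthy)
batch_b = [x + BATCH_OFFSET for x in true_disease]


def center(batch):
    """Subtract the batch mean (assumes all batches share one true mean)."""
    m = mean(batch)
    return [x - m for x in batch]


# Biological difference we would like to recover.
true_diff = mean(true_disease) - mean(true_healthy)

# Difference seen in the raw combined data: biology plus batch offset.
observed_diff = mean(batch_b) - mean(batch_a)

# Difference after mean centering: the offset is gone, but so is the biology.
corrected_diff = mean(center(batch_b)) - mean(center(batch_a))

print(f"true difference:      {true_diff}")
print(f"observed difference:  {observed_diff}")
print(f"after mean centering: {corrected_diff}")
```

In this toy setup the raw data overstate the biological difference, while the centered data erase it entirely, because the assumption behind the correction (equal true means across batches) does not hold. Real data sets are rarely this extreme, but the sketch shows the kind of distortion the abstract attributes to a mismatch between assumed and actual distributions.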


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/685b6b15e25a/10.1177_11779322231160397-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/c74cff8cea8f/10.1177_11779322231160397-fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/c02f6a3c7210/10.1177_11779322231160397-fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/ff222350f774/10.1177_11779322231160397-fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/a6405b410c86/10.1177_11779322231160397-fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/a70ce71b5c77/10.1177_11779322231160397-fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/d1fceaa426f6/10.1177_11779322231160397-fig7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/ca31add8b0b6/10.1177_11779322231160397-fig8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/9d4b262bfbc1/10.1177_11779322231160397-fig9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/8d454cd6ddf3/10.1177_11779322231160397-fig10.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/cf4db300554b/10.1177_11779322231160397-fig11.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/b16ab4c21144/10.1177_11779322231160397-fig12.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/12197122c0ea/10.1177_11779322231160397-fig13.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/622a9abae07f/10.1177_11779322231160397-fig14.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/63530af908af/10.1177_11779322231160397-fig15.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1e6/10068970/e9b1c8276746/10.1177_11779322231160397-fig16.jpg