Local neighbor Normalization: Reconciling accurate normalization and heterogeneity recovery in large-scale metabolomics.

Author information

Lu Keyi, Liu Yaru, Cheng Kian-Kai, Guo Fanjing, Deng Lingli, Dong Jiyang

Affiliations

Department of Electronic Science, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, 361005, China.

Faculty of Chemical and Energy Engineering, Universiti Teknologi Malaysia, Johor Bahru, Johor, 81310, Malaysia.

Publication information

Anal Chim Acta. 2025 Oct 22;1372:344440. doi: 10.1016/j.aca.2025.344440. Epub 2025 Jul 16.

DOI:10.1016/j.aca.2025.344440

PMID:40903117

Abstract

BACKGROUND

Metabolomics studies often contend with the dilution effect, in which overall sample concentration varies because of inconsistent handling or biological diversity, particularly in samples such as urine, saliva, or cell extracts. This variation can mask true metabolic differences and complicate data interpretation. Traditional normalization methods, such as Constant Sum Normalization (CSN), Probabilistic Quotient Normalization (PQN), and Maximal Density Fold Change (MDFC), assume that all samples share a certain invariant statistic. Because they overlook data heterogeneity, they can erase the very variation that is essential for distinguishing biological subgroups.
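To make the "shared invariant statistic" assumption concrete, the following is a minimal sketch of CSN and PQN as they are commonly described, not code from the paper. CSN assumes every sample has the same total intensity; PQN assumes the median fold change against one global reference spectrum reflects dilution only. The function names, the unit-sum target, and the use of the feature-wise median as the reference are illustrative assumptions, and the sketch assumes strictly positive intensities with samples in rows.

```python
import numpy as np

def csn(X, total=1.0):
    """Constant Sum Normalization: rescale every sample (row) to the same total."""
    X = np.asarray(X, dtype=float)
    return X * (total / X.sum(axis=1, keepdims=True))

def pqn(X):
    """Probabilistic Quotient Normalization against one global reference.

    Assumes strictly positive intensities (no zeros in the reference).
    """
    X = csn(X)                                   # integral normalization first
    reference = np.median(X, axis=0)             # one reference for ALL samples
    quotients = X / reference                    # per-feature fold changes
    dilution = np.median(quotients, axis=1, keepdims=True)
    return X / dilution                          # divide out the estimated dilution

# Hypothetical data: one profile measured at three dilution levels.
profile = np.array([[2.0, 4.0, 6.0, 8.0]])
X_demo = profile * np.array([[1.0], [0.5], [3.0]])
X_pqn = pqn(X_demo)
```

Both functions apply a single global assumption to every sample, which is the limitation the abstract attributes to these methods when the dataset contains distinct biological subgroups.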

RESULTS

To address this, we introduce Local Neighbor Normalization (LNN), a novel approach that corrects for dilution effects while preserving the intrinsic variability of metabolomics data. LNN identifies a neighbor set for each sample based on similarity metrics and normalizes each sample against a tailored reference spectrum derived from these neighbors. In comprehensive evaluations on both simulated and real metabolomics datasets from NMR, GC-MS, and LC-MS platforms, LNN outperformed CSN, PQN, and MDFC. Specifically, it eliminated dilution effects more effectively and better recovered inter-sample heterogeneity and inter-metabolite correlations, as evidenced by metrics such as the D-statistic and correlation recovery rates. Notably, LNN excels in datasets with over 50% differential metabolites, safeguarding local data structures that are critical for downstream analyses such as biomarker discovery.
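The two steps described above (find a neighbor set for each sample, then normalize against a reference built from those neighbors) can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify the similarity metric, the neighbor count, or how the reference spectrum is derived. Here cosine similarity, a fixed `k`, a median reference, and a PQN-style median quotient are all assumptions, as is the function name `local_neighbor_normalize`.

```python
import numpy as np

def local_neighbor_normalize(X, k=5):
    """Sketch: normalize each sample against a reference from its k nearest samples.

    Assumptions (not taken from the paper): cosine similarity, median reference,
    median-quotient scaling, strictly positive intensities, samples in rows.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of samples")

    # Rough common scale so that neighbor spectra are comparable.
    S = X / X.sum(axis=1, keepdims=True)

    # Cosine similarity is unaffected by per-sample dilution factors.
    unit = S / np.linalg.norm(S, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)               # a sample is not its own neighbor

    out = np.empty_like(S)
    for i in range(n):
        neighbors = np.argsort(sim[i])[-k:]      # indices of the k most similar samples
        reference = np.median(S[neighbors], axis=0)   # sample-specific reference
        factor = np.median(S[i] / reference)     # estimated relative dilution
        out[i] = S[i] / factor
    return out
```

Because the reference is rebuilt for every sample from its own neighborhood, samples from different subgroups are never forced toward one global spectrum, which is the property the abstract credits for preserving local structure. The choice of `k` is a tuning decision that this sketch leaves open.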

SIGNIFICANCE AND NOVELTY

LNN constructs sample-specific reference spectra based on a local neighbor set. This approach ensures that normalization accounts for dilution effects without compromising the local structure of the data, which is crucial for biological interpretation. Additionally, LNN demonstrates superior performance in recovering inter-sample heterogeneity and metabolite correlations, especially in datasets with high proportions of differential metabolites. The method's versatility, robustness against noise, and applicability across various metabolomics platforms make it a significant advancement in the field.
