Baker Heart and Diabetes Institute, Melbourne, VIC, 3004, Australia.
Baker Department of Cardiovascular Research, Translation and Implementation, La Trobe University, Melbourne, VIC, 3086, Australia.
Nat Commun. 2024 Feb 20;15(1):1540. doi: 10.1038/s41467-024-45838-3.
Recent advancements in plasma lipidomic profiling methodology have significantly increased the specificity and accuracy of lipid measurements. This evolution, driven by the improved chromatographic and mass spectrometric resolution of newer platforms, has made it challenging to align datasets created at different times or on different platforms. Here we present a framework for harmonising plasma lipidomic datasets whose lipid measurements differ in granularity. Our method uses elastic-net prediction models, constructed from high-resolution lipidomics reference datasets, to predict unmeasured lipid species in lower-resolution studies. The approach involves (1) constructing composite lipid measures in the reference dataset that map to less resolved lipids in the target dataset, (2) addressing discrepancies between aligned lipid species, (3) generating prediction models, (4) assessing their transferability into the target dataset, and (5) evaluating their prediction accuracy. To demonstrate our approach, we used the AusDiab population-based cohort (747 lipid species) as the reference to impute unmeasured lipid species into the LIPID study (342 lipid species). Furthermore, we compared measured and imputed lipids in terms of parameter estimation and predictive performance, and validated the imputations in an independent study. Our method for harmonising plasma lipidomic datasets will facilitate model validation and data integration efforts.
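As a rough illustration of steps (1), (3) and (5), the sketch below simulates a reference and a target dataset, collapses two resolved species into one composite measure, fits an elastic-net model in the reference, and checks the imputations against simulated truth. It is not the authors' implementation. The lipid names (`PC_a`, `PC_b`, `Cer`, `TG`, `U`), the simulated effect sizes, the penalty values and the plain coordinate-descent solver are all assumptions chosen for illustration, and steps (2) and (4) are omitted.

```python
import math
import random

def simulate(n, rng):
    """Simulate n samples; 'U' stands in for a lipid only the reference platform measures."""
    rows = []
    for _ in range(n):
        pc_a, pc_b, cer, tg = (rng.gauss(0, 1) for _ in range(4))
        u = 0.8 * pc_a + 0.5 * cer + rng.gauss(0, 0.3)
        rows.append({"PC_a": pc_a, "PC_b": pc_b, "Cer": cer, "TG": tg, "U": u})
    return rows

def to_target_resolution(rows):
    """Step 1: build the composite PC_a + PC_b, which the target reports as one measure."""
    return [[row["PC_a"] + row["PC_b"] for row in rows],
            [row["Cer"] for row in rows],
            [row["TG"] for row in rows]]

def mean_sd(col):
    m = sum(col) / len(col)
    return m, math.sqrt(sum((v - m) ** 2 for v in col) / len(col))

def soft_threshold(z, g):
    return z - g if z > g else z + g if z < -g else 0.0

def fit_elastic_net(X, y, lam=0.05, alpha=0.5, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2).
    X is a list of standardised columns (mean 0, variance 1); y is centred."""
    n, p = len(y), len(X)
    beta, resid = [0.0] * p, list(y)
    for _ in range(n_iter):
        for j in range(p):
            xj = X[j]
            rho = sum(xj[i] * resid[i] for i in range(n)) / n + beta[j]
            new = soft_threshold(rho, lam * alpha) / (1 + lam * (1 - alpha))
            delta = new - beta[j]
            if delta:
                for i in range(n):
                    resid[i] -= delta * xj[i]
                beta[j] = new
    return beta

def pearson(a, b):
    (ma, sa), (mb, sb) = mean_sd(a), mean_sd(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) * sa * sb)

rng = random.Random(0)
reference, target = simulate(300, rng), simulate(300, rng)

# Step 3: fit the model in the reference, using features at the target's resolution.
ref_cols = to_target_resolution(reference)
stats = [mean_sd(c) for c in ref_cols]
X_ref = [[(v - m) / s for v in c] for c, (m, s) in zip(ref_cols, stats)]
y_ref = [row["U"] for row in reference]
y_mean = sum(y_ref) / len(y_ref)
beta = fit_elastic_net(X_ref, [v - y_mean for v in y_ref])

# Impute U in the target, scaling its features with the reference mean and SD (an assumption).
X_tgt = [[(v - m) / s for v in c]
         for c, (m, s) in zip(to_target_resolution(target), stats)]
imputed = [y_mean + sum(b * col[i] for b, col in zip(beta, X_tgt))
           for i in range(len(target))]

# Step 5: prediction accuracy, possible here only because the simulation knows the truth.
accuracy_r = pearson(imputed, [row["U"] for row in target])
print("coefficients:", [round(b, 3) for b in beta])
print("correlation between imputed and true U:", round(accuracy_r, 3))
```

In a real analysis the truth is unavailable in the target dataset, so accuracy would have to be estimated within the reference (for example by cross-validation) or in an independent study, as the abstract describes.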