Bioinformatics Research Center, Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27606, USA.
Immunity, Inflammation, and Disease Laboratory, Division of Intramural Research, National Institute of Environmental Health Sciences, Durham, NC, 27709, USA.
Anal Bioanal Chem. 2024 Apr;416(9):2189-2202. doi: 10.1007/s00216-023-04991-2. Epub 2023 Oct 25.
The goal of lipidomic studies is to provide a broad characterization of the cellular lipids present and changing in a sample of interest. Recent lipidomic research has contributed significantly to revealing the multifaceted roles that lipids play in fundamental cellular processes, including signaling, energy storage, and structural support. These findings have also shed light on how lipids respond dynamically to various perturbations. Continued advances in analytical techniques have improved the ability to detect and identify novel lipid species, resulting in increasingly large datasets. Statistical analysis of these datasets can be challenging not only because of their size, but also because the data are highly correlated, since many lipids belong to the same metabolic or regulatory pathways. Interpretation of lipidomic datasets is further hindered by limited biological knowledge about individual lipids. Together, these limitations can make lipidomic data analysis a daunting task. We have assembled this review to address these difficulties and to highlight both the opportunities and the weaknesses of current tools. Here, we illustrate common statistical approaches for finding patterns in lipidomic datasets, including univariate hypothesis testing, unsupervised clustering, supervised classification modeling, and deep learning approaches. We then describe various bioinformatic tools often used to place results of interest in biological context. Overall, this review provides a framework for guiding lipidomic data analysis and promoting a more thorough assessment of lipidomic results, with an understanding of potential advantages and weaknesses along the way.
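The abstract names univariate hypothesis testing as one approach for large lipidomic datasets. When one test is run per lipid, the resulting p-values are commonly adjusted for multiple comparisons. The abstract does not say which correction the review recommends, so the Benjamini-Hochberg false discovery rate procedure below is an assumption chosen for illustration. The function name and the example p-values are hypothetical and are not taken from the review.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative sketch only: assumes one p-value per lipid from
    independent univariate tests (e.g. one t-test per lipid species).
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    # so that adjusted values never decrease with rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four lipid species (made-up numbers).
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
```

Lipids whose adjusted p-value falls below a chosen false discovery rate threshold would then be carried forward for interpretation. This sketch does not address the correlation among lipids that the abstract highlights, which can affect how such corrections behave.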