Jaakkola Maria K, Kukkonen-Macchi Anu, Suomi Tomi, Elo Laura L
Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland.
Department of Mathematics and Statistics, University of Turku, Turku, Finland.
Sci Rep. 2025 May 2;15(1):15393. doi: 10.1038/s41598-025-98492-0.
Pathway analysis is a frequent step in studies involving gene or protein expression data, but most available pathway methods are designed for simple case-versus-control studies of two sample groups without further complexity. The few methods that support more complex study designs cannot use pathway structures or handle situations where the variable of interest is not defined for all samples. Such scenarios are common in longitudinal studies whose follow-up time is so long that healthy controls are required to separate the effect of normal aging from the effect of disease development, which is not defined for controls. To address this need, we introduce a new method, Pathway Analysis of Longitudinal data (PAL), which is suitable for complex study designs such as longitudinal studies. The main advantages of PAL are its use of pathway structures and its suitability for study settings beyond the reach of currently available tools. We demonstrate the performance of PAL with simulated data and three longitudinal datasets related to the early development of type 1 diabetes; these datasets involve different study designs, contain only subtle biological signals, and include both transcriptomic and proteomic data. An R package implementing PAL is publicly available at https://github.com/elolab/PAL.