Hackenberg Maren, Pfaffenlehner Michelle, Behrens Max, Pechmann Astrid, Kirschner Janbernd, Binder Harald
Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany.
Freiburg Center for Data Analysis and Modeling, University of Freiburg, Freiburg, Germany.
Biom J. 2025 Feb;67(1):e70023. doi: 10.1002/bimj.70023.
In a longitudinal clinical registry, different measurement instruments might have been used for assessing individuals at different time points. To combine them, we investigate deep learning techniques for obtaining a joint latent representation, to which the items of different measurement instruments are mapped. This corresponds to domain adaptation, an established concept in computer science for image data. Using the proposed approach as an example, we evaluate the potential of domain adaptation in a longitudinal cohort setting with a rather small number of time points, motivated by an application with different motor function measurement instruments in a registry of spinal muscular atrophy (SMA) patients. There, we model trajectories in the latent representation by ordinary differential equations (ODEs), where person-specific ODE parameters are inferred from baseline characteristics. The goodness of fit and complexity of the ODE solutions then allow us to judge the measurement instrument mappings. We subsequently explore how alignment can be improved by incorporating corresponding penalty terms into model fitting. To systematically investigate the effect of differences between measurement instruments, we consider several scenarios based on modified SMA data, including scenarios where a mapping should be feasible in principle and scenarios where no perfect mapping is available. While misalignment increases in more complex scenarios, some structure is still recovered, even if the availability of measurement instruments depends on patient state. A reasonable mapping is also feasible in the more complex real SMA data set. These results indicate that domain adaptation might be more generally useful in statistical modeling for longitudinal registry data.
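The idea of mapping two instruments into one latent space and judging the mapping by how well an ODE fits the latent trajectory can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear encoders (the abstract refers to deep learning techniques), the two-dimensional latent space, the linear ODE dz/dt = Az + c, the linear map `P` from baseline covariates to ODE parameters, the explicit Euler solver, and all function and instrument names.

```python
import numpy as np

def euler_trajectory(z0, A, c, times, step=0.01):
    """Integrate dz/dt = A z + c from t = 0 with explicit Euler steps.

    Returns the latent state at each requested time (times must be increasing).
    """
    z, t, out = np.asarray(z0, dtype=float).copy(), 0.0, []
    for target in times:
        while t < target - 1e-12:
            h = min(step, target - t)   # shorten the last step to land on target
            z = z + h * (A @ z + c)
            t += h
        out.append(z.copy())
    return np.array(out)

def ode_params_from_baseline(x, P):
    """Hypothetical linear map from baseline covariates to (A, c), 2-D latent space."""
    theta = P @ x                       # 6 values: 4 for A, 2 for c
    return theta[:4].reshape(2, 2), theta[4:]

def trajectory_fit_loss(visits, encoders, x, P):
    """Mean squared gap between ODE-predicted and encoded latent states.

    visits: list of (time, instrument_name, item_vector), sorted by time,
    with the first visit at t = 0. Each instrument has its own encoder, so a
    small loss suggests the instruments are mapped consistently.
    """
    _, inst0, items0 = visits[0]
    z0 = encoders[inst0] @ items0
    A, c = ode_params_from_baseline(x, P)
    pred = euler_trajectory(z0, A, c, [t for t, _, _ in visits[1:]])
    obs = np.array([encoders[inst] @ items for _, inst, items in visits[1:]])
    return float(np.mean((pred - obs) ** 2))

# Synthetic illustration only: one person, two instruments with 5 and 8 items.
rng = np.random.default_rng(0)
encoders = {"inst1": rng.normal(size=(2, 5)), "inst2": rng.normal(size=(2, 8))}
P = 0.1 * rng.normal(size=(6, 3))       # assumed map from 3 baseline covariates
x = rng.normal(size=3)
visits = [
    (0.0, "inst1", rng.normal(size=5)),
    (0.5, "inst1", rng.normal(size=5)),
    (1.0, "inst2", rng.normal(size=8)),  # instrument switches at the last visit
]
loss = trajectory_fit_loss(visits, encoders, x, P)
```

In a fitting procedure, a quantity of this kind would be minimized over the encoder weights and `P`, and additional penalty terms could be added to it to encourage alignment; the sketch only evaluates the loss for fixed random parameters.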