Toward Robust Neuroanatomical Normative Models: Influence of Sample Size and Covariates Distributions.

Authors

Elleaume Camille, Hebling Vieira Bruno, Floris Dorothea L, Langer Nicolas

Publication information

bioRxiv. 2025 Aug 31:2025.08.26.672402. doi: 10.1101/2025.08.26.672402.

Abstract

Normative models are increasingly used to characterize individual-level brain deviations in neuroimaging studies, but their performance depends heavily on the reference sample used for training or adaptation. In this study, we systematically investigated how the sample size and covariate composition of the reference cohort influence model fit, deviation estimates, and clinical readouts in Alzheimer's disease (AD). Using a discovery dataset (OASIS-3, n = 1032), we trained models on healthy control (HC) subsamples ranging from 5 to 600 individuals, while varying age and sex distributions to simulate biases in reference populations. We further assessed adaptive transfer learning by pre-training models on the UK Biobank (n = 42,747) and adapting them to the clinical dataset using the same sub-sampling strategies. We evaluated model performance on a fixed HC test set and quantified deviation score errors, outlier detection, and classification accuracy in both the HC test set and the AD cohort. The findings were replicated in an external validation sample (AIBL, n = 463). Across all settings, model performance improved with increasing sample size, but demographic alignment of the covariates, particularly age, was essential for reliable deviation estimates. Models trained directly within the dataset achieved a stable fit with approximately 200 HCs, while adapted models pre-trained on large-scale data reached comparable performance with as few as 50 individuals. These results show that robust individual-level modeling can be achieved using moderately sized but demographically matched cohorts, supporting broader application of normative modeling in ageing and neurodegeneration research.
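The sub-sampling idea in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' pipeline: it assumes a simple linear age model fitted by ordinary least squares on fully synthetic data, and the helper names (`simulate`, `fit_normative`, `deviation_scores`, `mean_z_error`) as well as the intercept, slope, and noise values are hypothetical. It only shows, under those assumptions, how the error in deviation (z) scores on a fixed test set can be tracked as the reference sample grows or its age range narrows.

```python
import numpy as np

# Hypothetical generating model for one brain measure (synthetic, not study data).
TRUE_INTERCEPT, TRUE_SLOPE, TRUE_SD = 3.0, -0.01, 0.1

def simulate(n, rng, lo=45.0, hi=90.0):
    """Draw n synthetic healthy controls with ages uniform in [lo, hi]."""
    age = rng.uniform(lo, hi, size=n)
    y = TRUE_INTERCEPT + TRUE_SLOPE * age + rng.normal(0.0, TRUE_SD, size=n)
    return age, y

def fit_normative(age, y):
    """Least-squares fit of y ~ a + b * age; returns coefficients and residual SD."""
    X = np.column_stack([np.ones_like(age), age])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma = (y - X @ beta).std(ddof=2)  # two fitted parameters
    return beta, sigma

def deviation_scores(age, y, beta, sigma):
    """Z-scores of observations relative to the fitted norm."""
    X = np.column_stack([np.ones_like(age), age])
    return (y - X @ beta) / sigma

def mean_z_error(n, rng, test, z_true, lo=45.0, hi=90.0, reps=200):
    """Average squared z-score error on the fixed test set over repeated subsamples."""
    age_test, y_test = test
    errs = []
    for _ in range(reps):
        beta, sigma = fit_normative(*simulate(n, rng, lo, hi))
        z = deviation_scores(age_test, y_test, beta, sigma)
        errs.append(np.mean((z - z_true) ** 2))
    return float(np.mean(errs))

rng = np.random.default_rng(0)
test = simulate(2000, rng)  # fixed test set spanning the full age range
z_true = (test[1] - (TRUE_INTERCEPT + TRUE_SLOPE * test[0])) / TRUE_SD

# Error as a function of reference sample size (age-matched reference cohorts).
errors = {n: mean_z_error(n, rng, test, z_true) for n in (5, 50, 200, 600)}

# Same size (200) but an age-restricted reference cohort (45 to 60 years only).
error_age_biased = mean_z_error(200, rng, test, z_true, lo=45.0, hi=60.0)

for n, err in errors.items():
    print(f"n={n:4d}  mean squared z error = {err:.4f}")
print(f"n= 200 (ages 45-60 only)  mean squared z error = {error_age_biased:.4f}")
```

In this toy setting the error shrinks as the reference sample grows, and an age-restricted cohort of the same size does worse than an age-matched one because the fitted slope must be extrapolated to older test subjects. The specific thresholds reported in the abstract (about 200 HCs, or 50 with transfer learning) come from the study itself and are not reproduced by this sketch.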


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d13/12416260/6646a0f46100/nihpp-2025.08.26.672402v2-f0001.jpg
