Medical Statistics, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands.
Molecular Epidemiology, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands.
BMC Med Res Methodol. 2024 Mar 8;24(1):58. doi: 10.1186/s12874-024-02181-x.
There is divergence in the rate at which people age. The concept of biological age is postulated to capture this variability, and hence to better represent an individual's true global physiological state than chronological age. Biological age predictors are often generated based on cross-sectional data, using biochemical or molecular markers as predictor variables. It is assumed that the difference between chronological and predicted biological age is informative of one's chronological age-independent aging divergence ∆.
We investigated the statistical assumptions underlying the most popular cross-sectional biological age predictors, which are based on multiple linear regression, the Klemera-Doubal method or principal component analysis. We used synthetic and real data to illustrate the consequences when these assumptions do not hold.
The most popular cross-sectional biological age predictors all rely on the same strong underlying assumption, namely that the association of a candidate marker of aging with chronological age is directly informative of its association with the aging rate ∆. We called this the identical-association assumption and proved that it is untestable in a cross-sectional setting. If this assumption does not hold, the weights assigned to candidate markers of aging are uninformative, and no more signal may be captured than if the markers had been assigned weights at random.
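As a rough illustration of why the assumption cannot be checked from cross-sectional data, the following sketch simulates two hypothetical scenarios that differ only in the sign of each marker's association with ∆. It is not the simulation used in the study: the sample size, slopes, noise levels and the helper `fit_and_score` are all assumptions chosen for illustration, and only the multiple linear regression approach is shown. Because ∆ is drawn symmetrically around zero and independently of chronological age, the observable data (markers and chronological age) follow the same distribution in both scenarios, so the fitted weights agree up to sampling noise, yet the predicted age gap tracks ∆ in one scenario and tracks it with the opposite sign in the other.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5000, 5                       # assumed sample size and number of markers

age = rng.uniform(20, 80, n)         # chronological age
delta = rng.normal(0, 5, n)          # unobserved aging divergence, independent of age
a = rng.uniform(0.5, 1.5, m)         # assumed marker-age slopes
eps = rng.normal(0, 5, (n, m))       # marker measurement noise


def fit_and_score(sign):
    """Regress age on markers; return the marker weights and corr(age gap, delta).

    sign=+1: markers relate to delta exactly as they relate to age
             (the identical-association assumption holds).
    sign=-1: markers relate to delta with the opposite sign (assumption violated).
    """
    markers = np.outer(age + sign * delta, a) + eps
    design = np.column_stack([np.ones(n), markers])
    coef, *_ = np.linalg.lstsq(design, age, rcond=None)
    gap = design @ coef - age        # predicted biological age minus chronological age
    return coef[1:], np.corrcoef(gap, delta)[0, 1]


w_same, r_same = fit_and_score(+1)
w_opp, r_opp = fit_and_score(-1)

print("weights, assumption holds:   ", np.round(w_same, 3))
print("weights, assumption violated:", np.round(w_opp, 3))
print("corr(gap, delta), holds:   ", round(r_same, 2))
print("corr(gap, delta), violated:", round(r_opp, 2))
```

In practice ∆ is unobserved, so the last two correlations cannot be computed; only the weights and the fit to chronological age are visible, and in this sketch those look the same in both scenarios.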
Cross-sectional methods for predicting biological age commonly use the untestable identical-association assumption, which previous literature in the field had never explicitly acknowledged. These methods have inherent limitations and may provide uninformative results, highlighting the importance of researchers exercising caution in the development and interpretation of cross-sectional biological age predictors.