Institut National du Sport de l'Expertise et de la Performance, Institut de Recherche bioMédicale et d'Épidémiologie du Sport (IRMES), France.
Université Paris-Sud Orsay, France and Université Paris Descartes, Paris, France.
Biostatistics. 2019 Jan 1;20(1):48-64. doi: 10.1093/biostatistics/kxx044.
The clinical and biological follow-up of individuals, such as the biological passport for athletes, is typically based on individual, longitudinal monitoring of hematological or urine markers. These follow-ups aim to identify abnormal behavior by comparing the individual's biological samples to an established baseline. The comparison can be made in several ways, but each requires an appropriate external population to compute significance levels, which is a non-trivial issue. Moreover, it is not necessarily relevant to compare a professional athlete's biomarker measurements to those of a reference population (even one restricted to other athletes); a reasonable alternative is to detect abnormal values by considering only the other measurements of the same athlete. Here we propose a simple adaptive statistic based on maxima of Z-scores that does not rely on an external population. We show that, in the Gaussian framework, it is a practical and relevant method for detecting abnormal values in a series of observations from the same individual. Under the null hypothesis, the distribution of this statistic does not depend on the individual's parameters, and its quantiles can be computed by Monte Carlo simulation. The proposed method is tested on the 3-year follow-up of ferritin, serum iron, erythrocyte, hemoglobin, and hematocrit markers in 2577 elite male soccer players. For instance, considering abnormal values of hematocrit at the 5% level, we found that 5.57% of the selected cohort had at least one abnormal value (which is not significantly different from the expected false-discovery rate). The approach is a starting point for more elaborate models that would produce a refined individual baseline. The method can be extended to the Gaussian linear model in order to include additional variables such as age or exposure to altitude. It could also be applied to other domains, such as clinical patient follow-up, to monitor abnormal values of biological markers.
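As an illustration of how quantiles of a maximum-of-Z-scores statistic can be obtained by Monte Carlo simulation, here is a minimal Python sketch. The abstract does not give the exact definition of the adaptive statistic, so the code assumes a simpler stand-in: the largest absolute Z-score of a series, standardized by that series' own mean and standard deviation (essentially the classical maximum studentized deviation). The function names `max_abs_z`, `null_quantile`, and `has_abnormal_value` are hypothetical and not taken from the paper. Because this statistic is unchanged by shifting or rescaling the series, its null distribution can be simulated from standard normal draws whatever the individual's mean and variance.

```python
import numpy as np


def max_abs_z(x):
    """Largest absolute Z-score of a series, using the series' own mean and SD.

    Assumed stand-in statistic; the paper's adaptive statistic may differ.
    """
    x = np.asarray(x, dtype=float)
    return np.max(np.abs(x - x.mean()) / x.std(ddof=1))


def null_quantile(n, level=0.05, n_sim=20000, seed=0):
    """Monte Carlo (1 - level) quantile of max_abs_z for n Gaussian observations."""
    rng = np.random.default_rng(seed)
    # The statistic is location- and scale-invariant, so N(0, 1) draws suffice.
    sims = rng.standard_normal((n_sim, n))
    centered = np.abs(sims - sims.mean(axis=1, keepdims=True))
    z = centered / sims.std(axis=1, ddof=1, keepdims=True)
    return np.quantile(z.max(axis=1), 1 - level)


def has_abnormal_value(x, level=0.05, **kwargs):
    """Flag a series whose largest absolute Z-score exceeds the simulated quantile."""
    return bool(max_abs_z(x) > null_quantile(len(x), level, **kwargs))


if __name__ == "__main__":
    # Made-up values for illustration only, not data from the study.
    series = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98, 1.03, 5.0]
    print(has_abnormal_value(series))
```

The threshold depends only on the number of observations and the chosen level, so it can be simulated once per series length and reused across individuals.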