Guern Anne-Sophie, Vinh-Hung Vincent
Ecole nationale de la statistique et de l'analyse, rue Blaise-Pascal, 35170 Bruz.
Bull Cancer. 2008 Apr;95(4):449-55. doi: 10.1684/bdc.2008.0620.
Our aim was to characterize the statistical distribution of the number of involved lymph nodes in breast cancer. The data were a sample of 109,618 women from the US SEER (Surveillance, Epidemiology, and End Results) program. In a first analysis, we observed a log-concave distribution with overdispersion, which ruled out a Poisson stochastic process. A negative binomial (NB) distribution provided an acceptable fit. Overdispersion implies that some patients are at higher risk than expected, and/or that cascade processes are at work in which variability increases as more lymph nodes become involved. In a second series of analyses, we applied predictive models with and without the NB distribution. The commonly used logistic models predict only nodal status, and we found their predictive value to be poor. An NB generalized linear regression (NBGLR) allowed us to model the number of involved nodes. We argue that modeling the number of nodes, and not merely the nodal status, allows nodal involvement risk to be graded and might identify patients for whom neoadjuvant treatment would be justified. Incidentally, the NBGLR found a seasonal factor affecting the number of nodes in our sample, suggesting variability in medical practice, which might warrant further investigation.
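The overdispersion argument in the abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the counts below are synthetic (drawn with arbitrary, assumed parameters, not from SEER), the NB parameters are estimated by the method of moments rather than by whatever fitting procedure the paper used, and the function names are hypothetical. The sketch checks the variance-to-mean ratio, which equals 1 under a Poisson model, and compares the Poisson and NB log-likelihoods on the same counts.

```python
import math
import numpy as np


def poisson_loglik(counts, lam):
    """Log-likelihood of counts under a Poisson model with mean lam."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1)
               for k in map(int, counts))


def nb_loglik(counts, r, p):
    """Log-likelihood under NB(r, p): P(k) = C(k+r-1, k) * p**r * (1-p)**k."""
    return sum(math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log(1.0 - p)
               for k in map(int, counts))


def fit_nb_moments(counts):
    """Method-of-moments NB estimates; requires variance > mean."""
    m = float(np.mean(counts))
    v = float(np.var(counts, ddof=1))
    if v <= m:
        raise ValueError("no overdispersion: variance does not exceed mean")
    p = m / v                # from mean / variance = p
    r = m * m / (v - m)      # from mean = r * (1 - p) / p
    return r, p


# Synthetic, overdispersed counts standing in for "number of involved nodes".
# The parameters n=0.5, p=0.2 are arbitrary assumptions for illustration.
rng = np.random.default_rng(0)
counts = rng.negative_binomial(n=0.5, p=0.2, size=5000)

mean = float(np.mean(counts))
dispersion = float(np.var(counts, ddof=1)) / mean  # 1.0 expected under Poisson

r_hat, p_hat = fit_nb_moments(counts)
ll_poisson = poisson_loglik(counts, mean)  # Poisson MLE of lam is the mean
ll_nb = nb_loglik(counts, r_hat, p_hat)

print(f"variance/mean = {dispersion:.2f}")
print(f"NB estimates: r = {r_hat:.3f}, p = {p_hat:.3f}")
print(f"log-likelihood: Poisson = {ll_poisson:.1f}, NB = {ll_nb:.1f}")
```

A variance-to-mean ratio well above 1, together with a higher NB log-likelihood, is the pattern the abstract describes as excluding a Poisson process. Modeling the count with covariates, as in the NBGLR, would additionally need a regression library and is not shown here.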