Reproductive Medicine Centre, The Affiliated Chenggong Hospital of Xiamen University, Xiamen, Fujian, China.
School of Medicine, Xiamen University, Xiamen, Fujian, China.
Hum Reprod. 2024 Feb 1;39(2):364-373. doi: 10.1093/humrep/dead242.
STUDY QUESTION: How do pretreatment prediction models for IVF that were developed on UK/US populations (the McLernon 2016, Luke, Dhillon, and McLernon 2022 models) perform in wider populations?
SUMMARY ANSWER: For patients in China, the published pretreatment prediction models based on UK/US populations provide similar discriminatory power, with reasonable AUCs, but underestimate the probability of live birth.
WHAT IS KNOWN ALREADY: Several pretreatment prediction models for IVF allow patients and clinicians to estimate the cumulative probability of live birth in a cycle before treatment, but most are based on populations in Europe or the USA, and their performance and applicability in other countries and regions are largely unknown.
STUDY DESIGN, SIZE, DURATION: A total of 26 382 Chinese patients underwent oocyte pick-up cycles between January 2013 and December 2020.
PARTICIPANTS/MATERIALS, SETTING, METHODS: The performance of the UK/US models was externally validated using their published coefficients and intercepts. Centre-specific models were built with XGBoost, Lasso, and generalized linear model algorithms. The discriminatory power and calibration of the models were compared using the AUC of the receiver operating characteristic curve and calibration curves.
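As an illustration of external validation from published coefficients, the following minimal Python sketch applies a logistic model to patient covariates and computes the AUC. It assumes the model has a logistic form; the intercept, coefficient values, predictor names, and patient records are hypothetical placeholders and are not taken from any of the four models or from the study data.

```python
import math


def predict_probability(intercept, coefficients, features):
    """Apply a logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear_predictor = intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def auc(outcomes, probabilities):
    """AUC as the chance that a random positive case is ranked above a
    random negative case, with ties counted as one half.
    Pairwise comparison is O(n^2), which is fine for a small sketch."""
    positives = [p for y, p in zip(outcomes, probabilities) if y == 1]
    negatives = [p for y, p in zip(outcomes, probabilities) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p_pos in positives:
        for p_neg in negatives:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical intercept and coefficients (NOT values from any published model).
example_intercept = 2.0
example_coefficients = {"age_years": -0.08, "infertility_years": -0.05}

# Hypothetical patients with observed outcomes (1 = live birth, 0 = none).
patients = [
    ({"age_years": 28, "infertility_years": 2}, 1),
    ({"age_years": 41, "infertility_years": 6}, 0),
    ({"age_years": 33, "infertility_years": 3}, 1),
    ({"age_years": 38, "infertility_years": 8}, 0),
]

predicted = [
    predict_probability(example_intercept, example_coefficients, features)
    for features, _ in patients
]
observed = [outcome for _, outcome in patients]
print("AUC:", auc(observed, predicted))
```

The AUC measures ranking only, so a model can discriminate well while still predicting probabilities that are systematically too low; calibration has to be assessed separately.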
MAIN RESULTS AND THE ROLE OF CHANCE: The AUCs for the McLernon 2016, Luke, Dhillon, and McLernon 2022 models were 0.69 (95% CI 0.68-0.69), 0.67 (95% CI 0.67-0.68), 0.69 (95% CI 0.68-0.69), and 0.67 (95% CI 0.67-0.68), respectively. The centre-specific model yielded an AUC of 0.71 (95% CI 0.71-0.72), with key predictors including age, duration of infertility, and endocrine parameters. All external models underestimated the observed outcomes. Among the external models, the rescaled McLernon 2022 model demonstrated the best calibration (slope 1.12, intercept 0.06).
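A calibration slope and intercept of this kind are commonly obtained by regressing the observed outcome on the logit of the predicted probability. The sketch below shows that general approach with a small Newton-Raphson fit. It is an assumption that this matches the procedure used in the study, which the abstract does not specify, and the data are hypothetical. Predicted probabilities must lie strictly between 0 and 1.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def calibration_intercept_slope(outcomes, probabilities, iterations=25):
    """Fit logit(P(y = 1)) = a + b * logit(p_hat) by Newton-Raphson.

    Returns (a, b). A perfectly calibrated model gives a = 0 and b = 1.
    """
    x = [logit(p) for p in probabilities]
    a, b = 0.0, 1.0  # start from perfect calibration
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x_i, y_i in zip(x, outcomes):
            mu = 1.0 / (1.0 + math.exp(-(a + b * x_i)))
            weight = mu * (1.0 - mu)
            g_a += y_i - mu                # gradient w.r.t. intercept
            g_b += (y_i - mu) * x_i        # gradient w.r.t. slope
            h_aa += weight                 # entries of X'WX
            h_ab += weight * x_i
            h_bb += weight * x_i * x_i
        det = h_aa * h_bb - h_ab * h_ab
        if abs(det) < 1e-12:
            raise ValueError("singular information matrix; cannot fit")
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < 1e-10:
            break
    return a, b


# Hypothetical data: predictions of 0.2 and 0.5 where the observed
# event rates are 0.5 (2 of 4) and 0.8 (4 of 5), i.e. underestimation.
observed = [1, 1, 0, 0, 1, 1, 1, 1, 0]
predicted = [0.2] * 4 + [0.5] * 5
intercept, slope = calibration_intercept_slope(observed, predicted)
print("intercept:", round(intercept, 3), "slope:", round(slope, 3))
```

In this framing, a positive intercept with a slope near 1 indicates that observed outcomes are more frequent than predicted, which is the underestimation pattern described above.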
LIMITATIONS, REASONS FOR CAUTION: The study is limited by its single-centre design, and its findings may not be representative of other settings. Only per-complete cycle validation was carried out, to provide a similar framework for comparing different models in the sample population. Newer predictors, such as AMH, were not used.
WIDER IMPLICATIONS OF THE FINDINGS: Existing pretreatment prediction models for IVF may provide useful discriminatory power in populations different from those on which they were developed. However, models based on newer, more relevant datasets may provide better calibration.
STUDY FUNDING/COMPETING INTEREST(S): This work was supported by the National Natural Science Foundation of China [grant number 22176159], the Xiamen Medical Advantage Subspecialty Construction Project [grant number 2018296], and the Special Fund for Clinical and Scientific Research of Chinese Medical Association [grant number 18010360765].
TRIAL REGISTRATION NUMBER: N/A.