Department of Pediatrics, Alberta Research Centre for Health Evidence, University of Alberta, Edmonton Clinic Health Academy, 11405-87 Avenue NW, Edmonton, Alberta, T6G 1C9, Canada.
Syst Rev. 2023 Mar 21;12(1):51. doi: 10.1186/s13643-023-02181-w.
To inform recommendations by the Canadian Task Force on Preventive Health Care, we reviewed evidence on the benefits, harms, and acceptability of screening and treatment, and on the accuracy of risk prediction tools for the primary prevention of fragility fractures among adults aged 40 years and older in primary care.
For screening effectiveness, accuracy of risk prediction tools, and treatment benefits, our search methods involved integrating studies published up to 2016 from an existing systematic review. Then, to locate more recent studies and any evidence relating to acceptability and treatment harms, we searched online databases (2016 to April 4, 2022 [screening] or to June 1, 2021 [predictive accuracy]; 1995 to June 1, 2021, for acceptability; 2016 to March 2, 2020, for treatment benefits; 2015 to June 24, 2020, for treatment harms), trial registries and gray literature, and hand-searched reviews, guidelines, and the included studies. Two reviewers selected studies, extracted results, and appraised risk of bias, with disagreements resolved by consensus or a third reviewer. The overview of reviews on treatment harms relied on one reviewer, with verification of data by another reviewer to correct errors and omissions. When appropriate, study results were pooled using random effects meta-analysis; otherwise, findings were described narratively. Evidence certainty was rated according to the GRADE approach.
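As an illustration of the random effects pooling mentioned above, the following is a minimal sketch that assumes the DerSimonian-Laird estimator, which is one common choice; the abstract does not state which estimator the review used. The function name and the input values are hypothetical and are not data from the review.

```python
import math

def dersimonian_laird(effects, variances):
    """Pool study effects (e.g., log risk ratios) with a random effects model.

    Assumes the DerSimonian-Laird moment estimator for the between-study
    variance (tau^2). Returns (pooled effect, standard error, tau^2, I^2 in %).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed effect (inverse variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and its degrees of freedom.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Moment estimate of tau^2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, se, tau2, i2

# Hypothetical log risk ratios and variances, for illustration only.
pooled, se, tau2, i2 = dersimonian_laird([-0.30, -0.10, -0.25], [0.02, 0.03, 0.05])
print(f"pooled={pooled:.3f} se={se:.3f} tau2={tau2:.3f} I2={i2:.1f}%")
```

When the studies agree closely, tau^2 is truncated to zero and the result matches a fixed effect analysis; larger heterogeneity widens the pooled confidence interval.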
We included 4 randomized controlled trials (RCTs) and 1 controlled clinical trial (CCT) for the benefits and harms of screening, 1 RCT for comparative benefits and harms of different screening strategies, 32 validation cohort studies for the calibration of risk prediction tools (26 of these reporting on the Fracture Risk Assessment Tool without [i.e., clinical FRAX], or with the inclusion of bone mineral density (BMD) results [i.e., FRAX + BMD]), 27 RCTs for the benefits of treatment, 10 systematic reviews for the harms of treatment, and 12 studies for the acceptability of screening or initiating treatment. In females aged 65 years and older who are willing to independently complete a mailed fracture risk questionnaire (referred to as "selected population"), 2-step screening using a risk assessment tool with or without measurement of BMD probably (moderate certainty) reduces the risk of hip fractures (3 RCTs and 1 CCT, n = 43,736, absolute risk reduction [ARD] = 6.2 fewer in 1000, 95% CI 9.0-2.8 fewer, number needed to screen [NNS] = 161) and clinical fragility fractures (3 RCTs, n = 42,009, ARD = 5.9 fewer in 1000, 95% CI 10.9-0.8 fewer, NNS = 169). It probably does not reduce all-cause mortality (2 RCTs and 1 CCT, n = 26,511, ARD = no difference in 1000, 95% CI 7.1 fewer to 5.3 more) and may (low certainty) not affect health-related quality of life. Benefits for fracture outcomes were not replicated in an offer-to-screen population where the rate of response to mailed screening questionnaires was low. For females aged 68-80 years, population screening may not reduce the risk of hip fractures (1 RCT, n = 34,229, ARD = 0.3 fewer in 1000, 95% CI 4.2 fewer to 3.9 more) or clinical fragility fractures (1 RCT, n = 34,229, ARD = 1.0 fewer in 1000, 95% CI 8.0 fewer to 6.0 more) over 5 years of follow-up. The evidence for serious adverse events among all patients and for all outcomes among males and younger females (<65 years) is very uncertain. 
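The number needed to screen follows from the absolute risk difference by simple arithmetic: with the ARD expressed as events per 1000 people, NNS = 1000 / ARD. The sketch below applies this to the two ARD values reported above; the function name is hypothetical, and small differences from the published NNS may reflect rounding of the ARD before reporting.

```python
def number_needed(ard_per_1000: float) -> float:
    """Return 1000 / ARD, the unrounded number needed to screen (or treat)."""
    if ard_per_1000 <= 0:
        raise ValueError("ARD must be a positive number of events per 1000")
    return 1000.0 / ard_per_1000

hip = number_needed(6.2)       # hip fractures: ARD = 6.2 fewer in 1000
clinical = number_needed(5.9)  # clinical fragility fractures: ARD = 5.9 fewer in 1000
print(f"hip: {hip:.1f}, clinical fragility: {clinical:.1f}")
```

The same relationship applies to the numbers needed to treat (NNT) and to harm (NNH) reported for the treatment outcomes.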
We defined overdiagnosis as the identification of high risk in individuals who, if not screened, would never have known that they were at risk and would never have experienced a fragility fracture. This was not directly reported in any of the trials. Estimates using data available in the trials suggest that among "selected" females offered screening, 12% of those meeting age-specific treatment thresholds based on clinical FRAX 10-year hip fracture risk, and 19% of those meeting thresholds based on clinical FRAX 10-year major osteoporotic fracture risk, may be overdiagnosed as being at high risk of fracture. Of those identified as being at high clinical FRAX 10-year hip fracture risk and who were referred for BMD assessment, 24% may be overdiagnosed. One RCT (n = 9268) provided evidence comparing 1-step to 2-step screening among postmenopausal females, but the evidence from this trial was very uncertain. For the calibration of risk prediction tools, evidence from three Canadian studies (n = 67,611) without serious risk of bias concerns indicates that clinical FRAX-Canada may be well calibrated for the 10-year prediction of hip fractures (observed-to-expected fracture ratio [O:E] = 1.13, 95% CI 0.74-1.72, I² = 89.2%), and is probably well calibrated for the 10-year prediction of clinical fragility fractures (O:E = 1.10, 95% CI 1.01-1.20, I² = 50.4%), both leading to some underestimation of the observed risk. Data from these same studies (n = 61,156) showed that FRAX-Canada with BMD may perform poorly in estimating 10-year hip fracture risk (O:E = 1.31, 95% CI 0.91-2.13, I² = 92.7%), but is probably well calibrated for the 10-year prediction of clinical fragility fractures, with some underestimation of the observed risk (O:E = 1.16, 95% CI 1.12-1.20, I² = 0%).
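To make the calibration measure concrete, the sketch below computes an observed-to-expected (O:E) ratio for a single cohort and reads off its direction: a ratio above 1 means more fractures were observed than the tool predicted, so the tool underestimates risk. The function names and counts are hypothetical, and the confidence interval assumes a Poisson-distributed observed count with the expected count treated as fixed; the pooled intervals reported above come from meta-analysis, not from this formula.

```python
import math

def observed_to_expected(observed: int, expected: float, z: float = 1.96):
    """Return (O:E ratio, lower, upper) for one cohort.

    Assumption: the observed count is Poisson and the expected count is fixed,
    so the standard error of log(O:E) is approximately 1 / sqrt(observed).
    """
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected counts must be positive")
    ratio = observed / expected
    se_log = 1.0 / math.sqrt(observed)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper

def direction(ratio: float) -> str:
    """Describe what an O:E ratio implies about the tool's predictions."""
    if ratio > 1:
        return "underestimates observed risk"
    if ratio < 1:
        return "overestimates observed risk"
    return "matches observed risk"

# Hypothetical counts chosen only to illustrate the calculation.
ratio, lower, upper = observed_to_expected(observed=113, expected=100.0)
print(f"O:E = {ratio:.2f} (95% CI {lower:.2f}-{upper:.2f}); tool {direction(ratio)}")
```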
The Canadian Association of Radiologists and Osteoporosis Canada Risk Assessment (CAROC) tool may be well calibrated to predict a category of risk for 10-year clinical fractures (low, moderate, or high risk; 1 study, n = 34,060). The evidence for most other tools was limited, or in the case of FRAX tools calibrated for countries other than Canada, very uncertain due to serious risk of bias concerns and large inconsistency in findings across studies. In postmenopausal females in a primary prevention population, defined as <50% prevalence of prior fragility fracture (median 16.9%, range 0 to 48% when reported in the trials), and at risk of fragility fracture, treatment with bisphosphonates as a class (median 2 years, range 1-6 years) probably reduces the risk of clinical fragility fractures (19 RCTs, n = 22,482, ARD = 11.1 fewer in 1000, 95% CI 15.0-6.6 fewer, [number needed to treat for an additional beneficial outcome] NNT = 90), and may reduce the risk of hip fractures (14 RCTs, n = 21,038, ARD = 2.9 fewer in 1000, 95% CI 4.6-0.9 fewer, NNT = 345) and clinical vertebral fractures (11 RCTs, n = 8921, ARD = 10.0 fewer in 1000, 95% CI 14.0-3.9 fewer, NNT = 100); it may not reduce all-cause mortality. There is low certainty evidence of little-to-no reduction in hip fractures with any individual bisphosphonate, but all provided evidence of decreased risk of clinical fragility fractures (moderate certainty for alendronate [NNT = 68] and zoledronic acid [NNT = 50], low certainty for risedronate [NNT = 128]) among postmenopausal females. Evidence for an impact on risk of clinical vertebral fractures is very uncertain for alendronate and risedronate; zoledronic acid may reduce the risk of this outcome (4 RCTs, n = 2367, ARD = 18.7 fewer in 1000, 95% CI 25.6-6.6 fewer, NNT = 54) for postmenopausal females.
Denosumab probably reduces the risk of clinical fragility fractures (6 RCTs, n = 9473, ARD = 9.1 fewer in 1000, 95% CI 12.1-5.6 fewer, NNT = 110) and clinical vertebral fractures (4 RCTs, n = 8639, ARD = 16.0 fewer in 1000, 95% CI 18.6-12.1 fewer, NNT=62), but may make little-to-no difference in the risk of hip fractures among postmenopausal females. Denosumab probably makes little-to-no difference in the risk of all-cause mortality or health-related quality of life among postmenopausal females. Evidence in males is limited to two trials (1 zoledronic acid, 1 denosumab); in this population, zoledronic acid may make little-to-no difference in the risk of hip or clinical fragility fractures, and evidence for all-cause mortality is very uncertain. The evidence for treatment with denosumab in males is very uncertain for all fracture outcomes (hip, clinical fragility, clinical vertebral) and all-cause mortality. There is moderate certainty evidence that treatment causes a small number of patients to experience a non-serious adverse event, notably non-serious gastrointestinal events (e.g., abdominal pain, reflux) with alendronate (50 RCTs, n = 22,549, ARD = 16.3 more in 1000, 95% CI 2.4-31.3 more, [number needed to treat for an additional harmful outcome] NNH = 61) but not with risedronate; influenza-like symptoms with zoledronic acid (5 RCTs, n = 10,695, ARD = 142.5 more in 1000, 95% CI 105.5-188.5 more, NNH = 7); and non-serious gastrointestinal adverse events (3 RCTs, n = 8454, ARD = 64.5 more in 1000, 95% CI 26.4-13.3 more, NNH = 16), dermatologic adverse events (3 RCTs, n = 8454, ARD = 15.6 more in 1000, 95% CI 7.6-27.0 more, NNH = 64), and infections (any severity; 4 RCTs, n = 8691, ARD = 1.8 more in 1000, 95% CI 0.1-4.0 more, NNH = 556) with denosumab. 
For serious adverse events overall and specific to stroke and myocardial infarction, treatment with bisphosphonates probably makes little-to-no difference; evidence for other specific serious harms was less certain or not available. There was low certainty evidence for an increased risk for the rare occurrence of atypical femoral fractures (0.06 to 0.08 more in 1000) and osteonecrosis of the jaw (0.22 more in 1000) with bisphosphonates (most evidence for alendronate). The evidence for these rare outcomes and for rebound fractures with denosumab was very uncertain. Younger (lower risk) females have high willingness to be screened. A minority of postmenopausal females at increased risk for fracture may accept treatment. Further, there is large heterogeneity in the level of risk at which patients may be accepting of initiating treatment, and treatment effects appear to be overestimated.
An offer of 2-step screening with risk assessment and BMD measurement to selected postmenopausal females with low prevalence of prior fracture probably results in a small reduction in the risk of clinical fragility fracture and hip fracture compared to no screening. These findings were most applicable to the use of clinical FRAX for risk assessment and were not replicated in the offer-to-screen population where the rate of response to mailed screening questionnaires was low. Limited direct evidence on harms of screening was available; estimates derived from study data suggest there may be a moderate degree of overdiagnosis of high risk for fracture to consider. The evidence for younger females and males is very limited. The benefits of screening and treatment need to be weighed against the potential for harm; patient views on the acceptability of treatment are highly variable.
International Prospective Register of Systematic Reviews (PROSPERO): CRD42019123767.
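To inform recommendations by the Canadian Task Force on Preventive Health Care, we reviewed evidence on the benefits, harms, and acceptability of screening and treatment, and on the accuracy of risk prediction tools, for the primary prevention of fragility fractures among adults aged 40 years and older in primary care.
For screening effectiveness, the accuracy of risk prediction tools, and treatment benefits, our search methods involved integrating studies published up to 2016 from an existing systematic review. To locate more recent studies and any evidence relating to acceptability and treatment harms, we searched online databases (screening: 2016 to April 4, 2022; predictive accuracy: 2016 to June 1, 2021; acceptability: 1995 to June 1, 2021; treatment benefits: 2016 to March 2, 2020; treatment harms: 2015 to June 24, 2020), trial registries, and gray literature. Studies were reviewed with disagreements resolved by consensus or a third reviewer. The overview of reviews on treatment harms was conducted by one reviewer, with another reviewer verifying the data to correct errors and omissions. When appropriate, study results were pooled using random effects meta-analysis; otherwise, findings were described narratively. Evidence certainty was rated according to the GRADE approach.
We included 4 randomized controlled trials (RCTs) and 1 controlled clinical trial (CCT) for the benefits and harms of screening; 1 RCT for the comparative benefits and harms of different screening strategies; 32 validation cohort studies for the calibration of risk prediction tools (26 of these reporting on the Fracture Risk Assessment Tool without bone mineral density (BMD) results [i.e., clinical FRAX] or with them [i.e., FRAX + BMD]); 27 RCTs for the benefits of treatment; 10 systematic reviews for the harms of treatment; and 12 studies for the acceptability of screening or initiating treatment. In females aged 65 years and older who are willing to independently complete a mailed fracture risk questionnaire (referred to as the "selected population"), 2-step screening using a risk assessment tool with or without measurement of BMD probably (moderate certainty) reduces the risk of hip fractures (3 RCTs and 1 CCT, n = 43,736, absolute risk reduction [ARD] = 6.2 fewer in 1000, 95% CI 9.0-2.8 fewer, number needed to screen [NNS] = 161) and clinical fragility fractures (3 RCTs, n = 42,009, ARD = 5.9 fewer in 1000, 95% CI 10.9-0.8 fewer, NNS = 169). It probably does not reduce all-cause mortality (2 RCTs and 1 CCT, n = 26,511, ARD = no difference in 1000, 95% CI 7.1 fewer to 5.3 more) and may not affect health-related quality of life. The benefits for fracture outcomes were not replicated in an offer-to-screen population where the rate of response to mailed screening questionnaires was low. For females aged 68-80 years, population screening may not reduce the risk of hip fractures (1 RCT, n = 34,229, ARD = 0.3 fewer in 1000, 95% CI 4.2 fewer to 3.9 more) or clinical fragility fractures (1 RCT, n = 34,229, ARD = 1.0 fewer in 1000, 95% CI 8.0 fewer to 6.0 more) over 5 years of follow-up. The evidence for serious adverse events among all patients and for all outcomes among males and younger females (<65 years) is very uncertain. We defined overdiagnosis as the identification of high risk in individuals who, if not screened, would never have known that they were at risk of fracture and would never have experienced a fragility fracture. This was not directly reported in any of the trials. Estimates using data available in the trials suggest that among "selected" females offered screening, 12% of those meeting age-specific treatment thresholds based on clinical FRAX 10-year hip fracture risk, and 19% of those meeting thresholds based on clinical FRAX 10-year major osteoporotic fracture risk, may be overdiagnosed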