Pere J C, Begaud B, Haramburu F, Albin H
Clin Pharmacol Ther. 1986 Oct;40(4):451-61. doi: 10.1038/clpt.1986.206.
Several standardized assessment procedures are currently used in the evaluation of adverse drug reactions (ADRs). Disagreement in rating ADRs can result from between-raters variability and between-methods differences in weighting the evidence. We eliminated between-raters variability by computer simulation of 1134 ADRs (including all the possible combinations of criteria currently used) and by automatic rating using different algorithms adapted from six published methods. Percentage agreement (Po) and weighted kappa test (kappa w) between pairs of methods are always better than with randomized scores, but the strength of agreement is only moderate (0.26 < Po < 0.59; 0.14 < kappa w < 0.51). The weightings of criteria are evaluated in terms of sensitivity, specificity, and predictive values. Criteria are neither sensitive (0.41 < Se < 0.70) nor specific (0.18 < Sp < 0.63) and have poor predictive values. Disagreements on weightings are considerable for three major criteria: timing of event, dechallenge, and alternative etiologic candidates. We discuss some ways of improving reliability of ADR diagnosis.
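The two agreement statistics used above, percentage agreement (Po) and linearly weighted kappa, can be sketched for a pair of rating methods as follows. This is an illustrative sketch only: the four-level causality scale and the ratings from "method 1" and "method 2" are hypothetical, not the paper's simulated data.

```python
def percent_agreement(a, b):
    """Po: proportion of cases on which the two methods give the same rating."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def weighted_kappa(a, b, categories):
    """Cohen's weighted kappa with linear disagreement weights:
    kappa_w = 1 - (observed weighted disagreement / chance-expected one)."""
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(a)
    # Observed joint proportions over the k x k rating table
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[idx[x]][idx[y]] += 1 / n
    # Marginal proportions for each method
    pa = [sum(row) for row in obs]
    pb = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    # Linear weights: zero penalty on the diagonal, graded penalty off it
    w = [[abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]
    d_obs = sum(w[i][j] * obs[i][j] for i in range(k) for j in range(k))
    d_exp = sum(w[i][j] * pa[i] * pb[j] for i in range(k) for j in range(k))
    return 1 - d_obs / d_exp

# Hypothetical causality ratings of six cases by two assessment methods
cats = ["unlikely", "possible", "probable", "definite"]
m1 = ["probable", "possible", "definite", "unlikely", "possible", "probable"]
m2 = ["possible", "possible", "probable", "unlikely", "probable", "probable"]
print(percent_agreement(m1, m2))            # 0.5
print(round(weighted_kappa(m1, m2, cats), 3))  # 0.471
```

Weighted kappa corrects Po for chance agreement and, unlike raw agreement, penalizes a "definite" vs "unlikely" disagreement more than a "probable" vs "possible" one, which is why the paper reports both measures.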