Shauly-Aharonov Michal
The Hebrew University of Jerusalem, Jerusalem, Israel.
Stat Med. 2020 Apr 15;39(8):1041-1053. doi: 10.1002/sim.8460. Epub 2020 Jan 6.
In observational studies, it is widely agreed that the sensitivity of the findings to unmeasured confounders needs to be assessed. A poor choice of test statistic, however, can overstate the sensitivity to hidden bias of this kind. In this article, a new adaptive test is proposed, guided by considerations of low sensitivity to hidden bias: it is tailored so that its power is greater than that of other leading tests, in both finite and infinite samples. When confounders are possible, one way of defining power is as the probability of reporting that a true discovery is robust (ie, insensitive) to potential bias. For finite samples, we compute the power by simulation. As the sample size approaches infinity, a meaningful indicator of the power is the design sensitivity, which is computed analytically and found to be better for the new test than for existing tests. Another asymptotic criterion for comparing tests when there is concern about confounders is Bahadur efficiency. The proposed test outperforms commonly used tests in terms of Bahadur efficiency in most sampling situations. The advantages of the new test stem mainly from its adaptivity: it combines two test statistics and consequently achieves the better design sensitivity and the better Bahadur efficiency of the two. As a "real-world" examination, we compare 441 daily smokers with 441 nonsmokers to test the effect of smoking on periodontal disease. The new test is more robust to unmeasured confounders than both the Wilcoxon signed rank test and the paired t-test.
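To make the idea of sensitivity to hidden bias concrete, the following is a minimal sketch of a sensitivity analysis for one of the comparison tests named above, the Wilcoxon signed rank test. It is not the adaptive test proposed in the article. It assumes the standard matched-pairs sensitivity model, in which hidden bias can change the odds of treatment within a pair by at most a factor gamma, and it uses a large-sample normal approximation. The function name `sensitivity_pvalue`, the parameter `gamma`, and the example data are hypothetical and are not taken from the study.

```python
import math


def sensitivity_pvalue(diffs, gamma):
    """Upper bound on the one-sided p-value of the Wilcoxon signed rank
    test when hidden bias is at most gamma (assumed model, see lead-in).

    diffs: treated-minus-control differences, one per matched pair.
    gamma: bias bound, gamma >= 1; gamma == 1 means no hidden bias.
    """
    if gamma < 1:
        raise ValueError("gamma must be at least 1")

    # Zero differences carry no sign information, so drop them.
    d = [x for x in diffs if x != 0]

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Observed statistic: sum of the ranks of the positive differences.
    t = sum(r for r, x in zip(ranks, d) if x > 0)

    # Under bias at most gamma, each pair contributes its rank with
    # probability at most gamma / (1 + gamma); this gives the bounding
    # mean and variance used in the normal approximation.
    p_plus = gamma / (1 + gamma)
    mean = p_plus * sum(ranks)
    var = p_plus * (1 - p_plus) * sum(r * r for r in ranks)

    z = (t - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability


# Illustrative data only: 20 pairs, all with positive differences.
example = list(range(1, 21))
for g in (1.0, 2.0, 3.0):
    print(g, sensitivity_pvalue(example, g))
```

The bound grows with gamma, and a finding is reported as insensitive to bias of magnitude gamma when the bound stays below the chosen significance level. In the abstract's terms, a test with greater power in a sensitivity analysis is more likely to keep this bound small for a true effect at a given gamma.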