Novartis Pharmaceuticals Corporation, East Hanover, New Jersey, USA.
Oxford University, Oxford, UK.
Stat Med. 2021 Jun 30;40(14):3313-3328. doi: 10.1002/sim.8955. Epub 2021 Apr 25.
Knockoffs provide a general framework for controlling the false discovery rate when performing variable selection. Much of the knockoffs literature focuses on theoretical challenges, and we recognize a need to bring some of the current ideas into practice. In this paper, we propose a sequential algorithm for generating knockoffs when the underlying data consist of both continuous and categorical (factor) variables. Further, we present a heuristic multiple-knockoffs approach that offers a practical assessment of how robust the knockoff selection process is for a given dataset. We conduct extensive simulations to validate the performance of the proposed methodology. Finally, we demonstrate the utility of the methods on a large clinical data pool of more than 2000 patients with psoriatic arthritis evaluated in four clinical trials of an IL-17A inhibitor, secukinumab (Cosentyx), where we determine prognostic factors of a well-established clinical outcome. The analyses presented in this paper could be applied to a wide range of commonly encountered datasets in medical practice and in other fields where variable selection is of particular interest.
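The abstract does not spell out the selection rule, so the following is only a minimal sketch of how knockoff-based selection is commonly carried out once knockoffs have been generated. It assumes the standard "knockoff+" threshold from the knockoff filter literature, not necessarily the exact procedure used in this paper. The function name `knockoff_select` and the input `W` (one importance statistic per variable, where a large positive value favors the original variable over its knockoff) are hypothetical names chosen for illustration.

```python
import numpy as np


def knockoff_select(W, q=0.1):
    """Return indices selected by the knockoff+ threshold at target FDR q.

    W is assumed to hold one statistic per variable: positive when the
    original variable looks more important than its knockoff copy,
    negative when the knockoff looks more important.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds: the distinct nonzero magnitudes of W, ascending.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Estimated false discovery proportion at threshold t.
        # The "+1" in the numerator is what makes this the knockoff+ rule.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            # Smallest threshold meeting the target: select W_j >= t.
            return np.flatnonzero(W >= t)
    # No threshold achieves the target FDR: select nothing.
    return np.array([], dtype=int)


# Illustrative (made-up) statistics for ten variables.
W_example = [5.0, 4.0, 3.0, 2.5, 2.0, 1.5, -1.0, 0.5, -0.2, 0.1]
selected = knockoff_select(W_example, q=0.25)
```

With these made-up statistics and `q=0.25`, the rule keeps the six variables with the largest positive statistics. With a stricter `q=0.1` it selects nothing, because the "+1" term requires at least ten variables above the threshold before the estimate can fall to 0.1.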