Yap Jurel K, Gauran Iris Ivy M
School of Statistics, University of the Philippines Diliman, Quezon City, Philippines.
School of Government, Ateneo de Manila University, Quezon City, Philippines.
Comput Stat. 2022 Sep 18:1-20. doi: 10.1007/s00180-022-01283-8.
Given the high cost of HIV drug therapy research, it is important not only to maximize the true positive rate (TPR) by identifying which genetic markers are related to drug resistance, but also to minimize the false discovery rate (FDR) by reducing the number of markers incorrectly flagged as related to drug resistance. In this study, we propose a multiple testing procedure that unifies key concepts in computational statistics, namely Model-free Knockoffs, Bayesian variable selection, and the local false discovery rate. We develop an algorithm that uses the augmented data-Knockoff matrix and implements the Bayesian Lasso. We then identify signals using test statistics based on Markov Chain Monte Carlo outputs and the local false discovery rate. We compare our proposed methods against non-Bayesian methods such as Benjamini-Hochberg (BHq) and Lasso regression in terms of TPR and FDR. Using numerical studies, we show that the proposed method yields a lower FDR than BHq and Lasso in certain cases, such as the low-dimensional and equi-dimensional cases. We also discuss an application to an HIV-1 data set, with the aim of applying the method in future work to analyze genetic markers linked to drug-resistant HIV in the Philippines.
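For readers unfamiliar with the BHq baseline and the two evaluation metrics named above, the following is a minimal Python sketch of the standard Benjamini-Hochberg step-up rule together with empirical TPR and FDR. It is not the authors' implementation and does not cover the Knockoff or Bayesian Lasso steps; the function names `benjamini_hochberg` and `tpr_fdr`, the target level `q`, and the toy p-values are assumptions made for illustration only.

```python
import numpy as np


def benjamini_hochberg(p_values, q=0.1):
    """Return a boolean mask of hypotheses rejected by the BH step-up rule.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                      # indices of p-values, ascending
    thresholds = q * np.arange(1, m + 1) / m   # BH bound q * k / m for rank k
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])       # largest rank meeting its bound
        rejected[order[:k + 1]] = True         # reject everything up to that rank
    return rejected


def tpr_fdr(rejected, is_signal):
    """Empirical TPR and FDR given the true signal indicator (known in simulations)."""
    rejected = np.asarray(rejected, dtype=bool)
    is_signal = np.asarray(is_signal, dtype=bool)
    true_pos = np.sum(rejected & is_signal)
    false_pos = np.sum(rejected & ~is_signal)
    tpr = true_pos / max(is_signal.sum(), 1)   # guard against no true signals
    fdr = false_pos / max(rejected.sum(), 1)   # guard against no rejections
    return tpr, fdr


# Toy example: ten hypothetical p-values, the first three being true signals.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
truth = [True, True, True] + [False] * 7
mask = benjamini_hochberg(p_vals, q=0.05)
print(mask, tpr_fdr(mask, truth))
```

In a simulation study of the kind the abstract describes, `is_signal` would be the known set of nonzero coefficients, and the same `tpr_fdr` helper could in principle score any selection rule, including one based on the local false discovery rate.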