Agniel Denis, Xie Wen, Essex Myron, Cai Tianxi
RAND Corporation, 1776 Main St., Santa Monica, California 90401, USA.
Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, 655 Huntington Ave, Boston, Massachusetts 02115, USA.
Ann Appl Stat. 2018 Sep;12(3):1871-1893. doi: 10.1214/18-AOAS1135. Epub 2018 Sep 11.
HIV-1C is the most prevalent subtype of HIV-1 and accounts for over half of HIV-1 infections worldwide. The influence of host genetics on HIV infection has been studied previously in HIV-1B, but little attention has been paid to the more prevalent subtype C. To understand the role of host genetics in HIV-1C disease progression, we perform a study to assess the association between longitudinally collected measures of disease and more than 100,000 genetic markers located on chromosome 6. The most common approach to analyzing longitudinal data in this context is the linear mixed effects model, which may be overly simplistic here. On the other hand, existing flexible and nonparametric methods require densely sampled points, restrict attention to a single single nucleotide polymorphism (SNP), lack testing procedures, or are cumbersome to fit on the genome-wide scale. We propose a functional principal variance component (FPVC) testing framework that captures the nonlinearity in CD4 count and viral load with low degrees of freedom and is fast enough to carry out thousands or millions of times. FPVC testing unfolds in two stages. In the first stage, we summarize the markers of disease progression according to their major patterns of variation via functional principal components analysis (FPCA). In the second stage, we employ a simple working model and variance component testing to examine the association between the summaries of disease progression and a set of SNPs. We supplement this analysis with simulation results, which indicate that FPVC testing can offer large power gains over the standard linear mixed effects model.
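The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on simplifying assumptions that the abstract does not state: every subject is observed on a common, dense time grid (so FPCA reduces to a singular value decomposition of the centered curves), the test statistic is a simple score-type sum of squared cross-products between standardized FPCA scores and centered genotypes, and the p-value comes from permutation rather than an analytic null distribution. All function names (`simulate`, `fpca_scores`, `vc_statistic`, `fpvc_test`) and the simulated data are hypothetical.

```python
import numpy as np


def simulate(n=200, n_times=20, n_snps=5, effect=1.0, seed=0):
    """Toy data: trajectories on a common grid; SNP 0 has a nonlinear effect."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0, 1, n_times)
    G = rng.binomial(2, 0.3, size=(n, n_snps)).astype(float)  # genotypes 0/1/2
    subject = rng.normal(size=(n, 1))                         # random intercepts
    noise = rng.normal(scale=0.5, size=(n, n_times))
    Y = subject + effect * G[:, [0]] * np.sin(np.pi * t) + noise
    return Y, G


def fpca_scores(Y, n_components=2):
    """Stage 1 (assumed dense grid): FPCA via SVD of the centered curves."""
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # per-subject summaries
    return scores, Vt[:n_components]                 # scores, eigenfunctions


def vc_statistic(scores, G):
    """Stage 2: score-type statistic pooling all score-by-SNP cross-products."""
    Z = (scores - scores.mean(axis=0)) / scores.std(axis=0)
    Gc = G - G.mean(axis=0)
    return float(np.sum((Z.T @ Gc) ** 2))


def fpvc_test(Y, G, n_components=2, n_perm=999, seed=0):
    """Permutation p-value for association between trajectories and a SNP set."""
    rng = np.random.default_rng(seed)
    scores, _ = fpca_scores(Y, n_components)
    q_obs = vc_statistic(scores, G)
    # Shuffling subjects' scores breaks any link to genotype under the null.
    q_null = np.array([
        vc_statistic(scores[rng.permutation(len(scores))], G)
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(q_null >= q_obs)) / (n_perm + 1)
    return q_obs, p_value


if __name__ == "__main__":
    Y, G = simulate()
    q, p = fpvc_test(Y, G)
    print(f"Q = {q:.1f}, permutation p-value = {p:.4f}")
```

Because the trajectories are reduced to a few scores once, Stage 1 does not need to be repeated for each SNP set; only the inexpensive Stage 2 statistic is recomputed per set, which is in the spirit of the speed requirement the abstract describes. Sparse or irregularly sampled data would need a covariance-smoothing form of FPCA in place of the SVD used here.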