Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, Texas, USA.
Division of Clinical and Translational Sciences, Department of Internal Medicine, University of Texas Health Science Center at Houston, Houston, Texas, USA.
Biometrics. 2023 Jun;79(2):1573-1585. doi: 10.1111/biom.13636. Epub 2022 Feb 24.
The rapid acceleration of genetic data collection in biomedical settings has recently resulted in the rise of genetic compendiums filled with rich longitudinal disease data. One common feature of these data sets is their abundance of interval-censored outcomes. However, very few tools are available for the analysis of genetic data sets with interval-censored outcomes, and in particular, there is a lack of methodology for set-based inference. Set-based inference is used to associate a gene, biological pathway, or other genetic construct with outcomes and is one of the most popular strategies in genetics research. This work develops three such tests for interval-censored settings, beginning with a variance components test for interval-censored outcomes, the interval-censored sequence kernel association test (ICSKAT). We also provide the interval-censored version of the Burden test, and then we integrate ICSKAT and Burden to construct the interval-censored sequence kernel association test-optimal (ICSKATO) combination. These tests unlock set-based analysis of interval-censored data sets with analogs of three highly popular set-based tools commonly applied to continuous and binary outcomes. Simulation studies illustrate the advantages of the developed methods over ad hoc alternatives, including protection of the type I error rate at very low levels and increased power. The proposed approaches are applied to the investigation that motivated this study, an examination of the genes associated with bone mineral density deficiency and fracture risk.
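As a rough illustration of how the three statistics relate, the sketch below uses the generic forms of the SKAT, Burden, and SKAT-O statistics from the continuous and binary outcome literature. It assumes these forms carry over once per-variant score statistics are available. The function name, the `scores` and `weights` inputs, and the fixed `rho` are hypothetical; how the scores are obtained under interval censoring and how p-values are computed are not shown and are not taken from the abstract.

```python
import numpy as np


def set_based_statistics(scores, weights, rho):
    """Combine per-variant score statistics into set-level statistics.

    scores  : per-variant score statistics U_j (assumed to be precomputed
              from a null model; hypothetical input for this sketch)
    weights : per-variant weights w_j
    rho     : mixing parameter in [0, 1]; 0 gives the SKAT-type statistic,
              1 gives the Burden-type statistic
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weighted = weights * scores

    # Variance-components (SKAT-type) statistic: sum of squared weighted scores.
    q_skat = float(np.sum(weighted ** 2))

    # Burden-type statistic: square of the summed weighted scores.
    q_burden = float(np.sum(weighted) ** 2)

    # SKAT-O-type statistic for one rho: a convex combination of the two.
    q_rho = (1.0 - rho) * q_skat + rho * q_burden
    return q_skat, q_burden, q_rho


# Two variants with effects in opposite directions: the Burden-type
# statistic cancels to zero while the SKAT-type statistic does not.
q_skat, q_burden, q_rho = set_based_statistics([1.0, -1.0], [1.0, 1.0], rho=0.5)
```

In an optimal combination of this kind, the statistic is typically evaluated over a grid of `rho` values and the smallest p-value is then calibrated; that step is omitted here.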