Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA.
Department of Data Science, Graduate School of Data Science, Seoul National University, Seoul, Republic of Korea.
Genet Epidemiol. 2021 Apr;45(3):293-304. doi: 10.1002/gepi.22370. Epub 2020 Nov 8.
Recent advances in genotyping and sequencing technologies have enabled genetic association studies to leverage high-quality genotype data to identify variants that account for a substantial portion of disease risk. Using external controls, whose genomes have already been genotyped and are publicly available, could be a cost-effective way to increase the power of association testing. Recent efforts, such as the integrating External Controls into Association Test (iECAT), integrate external controls while adjusting for possible batch effects. The original iECAT, however, cannot adjust for covariates such as age and gender. Building on the insight of iECAT, we propose a novel score-based test that allows for covariate adjustment. It constructs a shrinkage score statistic that is a weighted sum of two score statistics: one that uses exclusively internal samples, and one that uses both internal and external control samples. We assess the existence of a batch effect at a variant by comparing the internal and external control samples. We show by simulation studies that our method has increased power over the original iECAT while controlling type I error rates. We apply our method to association studies of age-related macular degeneration (AMD), using data from the International AMD Genomics Consortium and the Michigan Genomics Initiative. By incorporating the score test approach, we extend iECAT to adjust for covariates and improve power, further honing the statistical methods needed to identify disease-causing variants within the human genome.
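To make the idea of a covariate-adjusted shrinkage score statistic concrete, the following is a minimal sketch and not the authors' implementation. It assumes a logistic null model fitted by Newton-Raphson, and it takes the shrinkage weight as an input: the abstract does not give the weight formula, so `weight` here is a hypothetical placeholder for whatever the internal-versus-external control comparison would produce. All function names and the simulated data are illustrative assumptions.

```python
import numpy as np

def fit_null_logistic(X, y, n_iter=25):
    """Fit y ~ covariates (X includes an intercept column) by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        w = mu * (1.0 - mu)
        # Newton step: beta += (X' W X)^{-1} X' (y - mu)
        beta += np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - mu))
    return 1.0 / (1.0 + np.exp(-X @ beta))

def score_stat(g, X, y):
    """Score for genotype g under the covariate-only null model, with its variance."""
    mu = fit_null_logistic(X, y)
    w = mu * (1.0 - mu)
    score = g @ (y - mu)
    XtWg = X.T @ (w * g)
    # Variance of the score after projecting out the covariates.
    var = g @ (w * g) - XtWg @ np.linalg.solve(X.T @ (X * w[:, None]), XtWg)
    return score, var

def shrinkage_score(s_int, s_comb, weight):
    """Weighted sum of two scores. ASSUMPTION: weight=1 means internal samples only,
    weight=0 means the external controls are fully trusted."""
    return weight * s_int + (1.0 - weight) * s_comb

# Simulated example (all values are made up for illustration).
rng = np.random.default_rng(1)
n_int, n_ext = 400, 400
X_int = np.column_stack([np.ones(n_int), rng.normal(size=n_int)])  # intercept + scaled age
X_ext = np.column_stack([np.ones(n_ext), rng.normal(size=n_ext)])
y_int = rng.binomial(1, 0.5, size=n_int).astype(float)  # internal cases and controls
y_ext = np.zeros(n_ext)                                  # external samples are all controls
g_int = rng.binomial(2, 0.3, size=n_int).astype(float)
g_ext = rng.binomial(2, 0.3, size=n_ext).astype(float)

s_int, v_int = score_stat(g_int, X_int, y_int)
s_comb, v_comb = score_stat(np.concatenate([g_int, g_ext]),
                            np.vstack([X_int, X_ext]),
                            np.concatenate([y_int, y_ext]))
print(s_int, v_int, s_comb, v_comb, shrinkage_score(s_int, s_comb, weight=0.5))
```

In this sketch the two scores are combined on their raw scale. A real test would also need the variance of the combined statistic and a data-driven weight, neither of which is specified in the abstract.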