A unifying framework for rare variant association testing in family-based designs, including higher criticism approaches, SKATs, and burden tests.

Authors

Hecker Julian, Townes F William, Kachroo Priyadarshini, Laurie Cecelia, Lasky-Su Jessica, Ziniti John, Cho Michael H, Weiss Scott T, Laird Nan M, Lange Christoph

Affiliations

Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA.

Department of Computer Science, Princeton University, Princeton, NJ 08540-5233, USA.

Publication information

Bioinformatics. 2021 Apr 1;36(22-23):5432-5438. doi: 10.1093/bioinformatics/btaa1055.

Abstract

MOTIVATION

Analysis of rare variants in family-based studies remains a challenge. Transmission-based approaches provide robustness against population stratification, but the evaluation of the significance of test statistics based on asymptotic theory can be imprecise. Also, power will depend heavily on the choice of the test statistic and on the underlying genetic architecture of the locus, which will be generally unknown.

RESULTS

In our proposed framework, we utilize the FBAT haplotype algorithm to obtain the conditional offspring genotype distribution under the null hypothesis given the sufficient statistic. Based on this conditional distribution, the significance of virtually any association test statistic can be evaluated by simulation or exact computation, without the need for asymptotic approximations. Besides standard linear burden-type statistics, this enables our approach to evaluate other test statistics such as variance component statistics, higher criticism approaches, and maximum single-variant statistics, for which asymptotic theory can be complicated or does not provide accurate approximations for rare variant data. Based on the resulting P-values, combined test statistics such as the aggregated Cauchy association test (ACAT) can also be utilized. In simulation studies, we show that our framework outperforms existing approaches for family-based studies in several scenarios. We also applied our methodology to a TOPMed whole-genome sequencing dataset with 897 asthmatic trios from Costa Rica.
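The simulation-based evaluation described above can be illustrated with a minimal sketch. It is not the FBAT implementation and does not reproduce the FBAT haplotype algorithm. It assumes a much simpler setting: complete trios with one affected offspring, fully phased parental haplotypes, no recombination within the region, and equal variant weights. Under those assumptions the null distribution of the offspring genotype given the parents is plain Mendelian transmission, so each parent passes on either haplotype with probability 1/2. All function names are hypothetical. The `acat` helper uses the equal-weight form of the Cauchy combination, p = 1/2 - arctan(mean of tan((1/2 - p_i)π))/π.

```python
import math
import random

def draw_offspring(father, mother, rng):
    """Draw one offspring genotype vector under the null (Mendelian transmission)."""
    hap_f = father[rng.randrange(2)]  # each paternal haplotype has probability 1/2
    hap_m = mother[rng.randrange(2)]  # each maternal haplotype has probability 1/2
    return [a + b for a, b in zip(hap_f, hap_m)]

def expected_genotype(father, mother):
    """Expected offspring allele count per variant given both parents."""
    return [(f0 + f1 + m0 + m1) / 2
            for f0, f1, m0, m1 in zip(father[0], father[1], mother[0], mother[1])]

def per_variant_scores(offspring, parents):
    """Sum over trios of (observed - expected) allele counts, per variant."""
    n_variants = len(offspring[0])
    scores = [0.0] * n_variants
    for child, (father, mother) in zip(offspring, parents):
        expected = expected_genotype(father, mother)
        for j in range(n_variants):
            scores[j] += child[j] - expected[j]
    return scores

def burden_stat(offspring, parents):
    """Equal-weight burden statistic: absolute value of the summed scores."""
    return abs(sum(per_variant_scores(offspring, parents)))

def max_single_variant_stat(offspring, parents):
    """Largest absolute single-variant score."""
    return max(abs(s) for s in per_variant_scores(offspring, parents))

def simulated_p_value(stat, offspring, parents, n_sim=2000, seed=0):
    """Monte Carlo p-value of `stat` under the conditional null distribution."""
    rng = random.Random(seed)
    observed = stat(offspring, parents)
    hits = 0
    for _ in range(n_sim):
        simulated = [draw_offspring(f, m, rng) for f, m in parents]
        if stat(simulated, parents) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)

def acat(p_values):
    """Equal-weight Cauchy combination of p-values (inputs of exactly 0 or 1 are not handled)."""
    t = sum(math.tan((0.5 - p) * math.pi) for p in p_values) / len(p_values)
    return 0.5 - math.atan(t) / math.pi

# Toy data (assumed, not from the paper): 12 trios, 2 rare variants,
# each parent carries one rare allele and every offspring inherits both.
father = ((1, 0), (0, 0))
mother = ((0, 1), (0, 0))
parents = [(father, mother)] * 12
offspring = [[1, 1]] * 12

p_burden = simulated_p_value(burden_stat, offspring, parents)
p_max = simulated_p_value(max_single_variant_stat, offspring, parents, seed=1)
print(p_burden, p_max, acat([p_burden, p_max]))
```

Because the null distribution is sampled directly, any statistic with the same `(offspring, parents)` signature can be passed to `simulated_p_value` without deriving its asymptotic distribution, which is the point the abstract makes. The smallest attainable p-value is 1/(n_sim + 1), so `n_sim` has to grow with the significance threshold of interest.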

AVAILABILITY AND IMPLEMENTATION

FBAT software is available at https://sites.google.com/view/fbatwebpage. Simulation code is available at https://github.com/julianhecker/FBAT_rare_variant_test_simulations. Whole-genome sequencing data for 'NHLBI TOPMed: The Genetic Epidemiology of Asthma in Costa Rica' is available at https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000988.v4.p1.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
