PowerBacGWAS: a computational pipeline to perform power calculations for bacterial genome-wide association studies.

Affiliations

Department of Infection Biology, Faculty of Infectious & Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK.

Department of Medicine, University of Cambridge, Cambridge, UK.

Publication information

Commun Biol. 2022 Mar 25;5(1):266. doi: 10.1038/s42003-022-03194-2.

Abstract

Genome-wide association studies (GWAS) are increasingly being applied to investigate the genetic basis of bacterial traits. However, approaches to perform power calculations for bacterial GWAS are limited. Here we implemented two alternative approaches to conduct power calculations using existing collections of bacterial genomes. First, a sub-sampling approach was used to reduce the allele frequency and effect size of a known and detectable genotype-phenotype relationship by modifying phenotype labels. Second, a phenotype-simulation approach was used to simulate phenotypes from existing genetic variants. We implemented both approaches in a computational pipeline (PowerBacGWAS) that supports power calculations for burden testing, pan-genome and variant GWAS, and applied it to collections of Enterococcus faecium, Klebsiella pneumoniae and Mycobacterium tuberculosis. We used this pipeline to determine the sample sizes required to detect causal variants of different minor allele frequencies (MAF), effect sizes and phenotype heritability, and studied the effect of homoplasy and population diversity on the power to detect causal variants. Our pipeline and user documentation are publicly available and can be applied to other bacterial populations. PowerBacGWAS can be used to determine the sample sizes required to find statistically significant associations, or the associations detectable with a given sample size. We recommend performing power calculations using existing genomes of the bacterial species and population under study.
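
The phenotype-simulation idea described in the abstract can be illustrated with a small, self-contained sketch. This is not PowerBacGWAS code and does not reproduce its method: it assumes a single binary causal variant, a binary phenotype, and a plain 2x2 chi-square test, and it ignores population structure, which a real bacterial GWAS would need to correct for. All function names, the baseline phenotype probability, and the significance threshold are hypothetical choices made for illustration.

```python
import math
import random


def simulate_phenotypes(genotypes, odds_ratio, baseline_prob, rng):
    """Draw a binary phenotype per genome from its causal-variant state.

    Non-carriers get the phenotype with probability `baseline_prob`;
    carriers get it with the probability implied by `odds_ratio`.
    """
    base_odds = baseline_prob / (1.0 - baseline_prob)
    carrier_odds = base_odds * odds_ratio
    carrier_prob = carrier_odds / (1.0 + carrier_odds)
    return [
        1 if rng.random() < (carrier_prob if g else baseline_prob) else 0
        for g in genotypes
    ]


def chi_square_p_value(genotypes, phenotypes):
    """P-value of an uncorrected 2x2 chi-square test (1 degree of freedom)."""
    a = b = c = d = 0
    for g, p in zip(genotypes, phenotypes):
        if g and p:
            a += 1  # carrier, phenotype present
        elif g:
            b += 1  # carrier, phenotype absent
        elif p:
            c += 1  # non-carrier, phenotype present
        else:
            d += 1  # non-carrier, phenotype absent
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # a margin is empty, so no association can be tested
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def estimate_power(genotypes, sample_size, odds_ratio,
                   baseline_prob=0.2, alpha=1e-5, replicates=200, seed=0):
    """Fraction of replicates in which the causal variant reaches p < alpha.

    Each replicate sub-samples `sample_size` genomes without replacement
    from the existing collection, then simulates phenotypes for them.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        sample = rng.sample(genotypes, sample_size)
        phenotypes = simulate_phenotypes(sample, odds_ratio, baseline_prob, rng)
        if chi_square_p_value(sample, phenotypes) < alpha:
            hits += 1
    return hits / replicates


if __name__ == "__main__":
    # Hypothetical collection: 1000 genomes, causal allele frequency 0.2.
    collection = [1] * 200 + [0] * 800
    for n in (50, 100, 250, 500):
        print(n, estimate_power(collection, n, odds_ratio=5.0))
```

Running the script prints an estimated power for each sample size, which is the kind of curve one would inspect to choose a sample size for a given allele frequency and effect size. Replacing the chi-square test with the association model actually planned for the study would be the natural next step.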

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fa8/8956664/c0fc4bddd284/42003_2022_3194_Fig1_HTML.jpg
