TEAM: efficient two-locus epistasis tests in human genome-wide association study.

Affiliation

Department of Computer Science, University of North Carolina at Chapel Hill, USA.

Publication information

Bioinformatics. 2010 Jun 15;26(12):i217-27. doi: 10.1093/bioinformatics/btq186.

Abstract

As a promising tool for identifying genetic markers underlying phenotypic differences, genome-wide association studies (GWAS) have been extensively investigated in recent years. In GWAS, detecting epistasis (gene-gene interaction) is preferable to single-locus analysis, since many diseases are known to be complex traits. A brute-force search for epistasis is infeasible at the genome-wide scale because of the heavy computational burden. Existing epistasis detection algorithms are designed for datasets with homozygous markers and small sample sizes. In human studies, however, genotypes may be heterozygous and the number of individuals can reach the thousands, so existing methods are not readily applicable to human datasets. In this article, we propose an efficient algorithm, TEAM, which significantly speeds up epistasis detection for human GWAS. The algorithm is exhaustive, i.e. it does not ignore any epistatic interaction. Using a minimum spanning tree structure, it incrementally updates the contingency tables for epistatic tests without scanning all individuals. TEAM has broader applicability and is more efficient than existing methods for large-sample studies. It supports any statistical test based on contingency tables, and enables control of both the family-wise error rate and the false discovery rate. Extensive experiments show that the algorithm needs to examine only a small portion of the individuals to update the contingency tables, and that it achieves at least an order of magnitude speedup over the brute-force approach.
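The incremental update mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not say what the tree is built over or how the update is carried out, so everything below is an assumption for illustration: tree nodes are SNPs, an edge weight is the number of individuals whose genotypes differ between two SNPs, genotypes are coded 0/1/2, and the phenotype is binary. All function names (`mst_edges`, `edge_diffs`, `full_table`, `all_tables_with`) are hypothetical and are not taken from the TEAM software.

```python
def hamming(a, b):
    """Number of individuals whose genotypes differ between two SNPs."""
    return sum(x != y for x, y in zip(a, b))

def mst_edges(snps):
    """Prim's algorithm rooted at SNP 0 (assumed tree construction).
    Returns (parent, child) edges in the order the children join the tree."""
    best = {v: (hamming(snps[0], snps[v]), 0) for v in range(1, len(snps))}
    edges = []
    while best:
        v = min(best, key=lambda u: best[u][0])
        _, parent = best.pop(v)
        edges.append((parent, v))
        for u in best:
            d = hamming(snps[v], snps[u])
            if d < best[u][0]:
                best[u] = (d, v)
    return edges

def edge_diffs(snps, edges):
    """For each tree edge, the individuals whose genotype differs along it.
    Computed once and reused for every partner SNP."""
    return [[k for k in range(len(snps[p])) if snps[p][k] != snps[c][k]]
            for p, c in edges]

def full_table(xi, xj, y):
    """3 x 3 x 2 contingency table built by scanning every individual."""
    table = [[[0, 0] for _ in range(3)] for _ in range(3)]
    for a, b, c in zip(xi, xj, y):
        table[a][b][c] += 1
    return table

def all_tables_with(i, snps, y, edges, diffs):
    """Tables for the pairs (SNP i, SNP j) for every j. Only the root pair
    is scanned in full; each other table is derived from its tree parent
    by moving the counts of the individuals that differ along the edge."""
    xi = snps[i]
    tables = {0: full_table(xi, snps[0], y)}
    for (p, c), diff in zip(edges, diffs):
        t = [[cell[:] for cell in row] for row in tables[p]]
        for k in diff:
            t[xi[k]][snps[p][k]][y[k]] -= 1
            t[xi[k]][snps[c][k]][y[k]] += 1
        tables[c] = t
    return tables

# Toy data (made up for illustration): 4 SNPs, 8 individuals.
snps = [
    [0, 1, 2, 0, 1, 2, 0, 1],
    [0, 1, 2, 0, 1, 2, 0, 2],
    [0, 1, 1, 0, 1, 2, 0, 2],
    [2, 1, 1, 0, 0, 2, 0, 2],
]
y = [0, 1, 1, 0, 1, 0, 0, 1]

edges = mst_edges(snps)
diffs = edge_diffs(snps, edges)
tables = all_tables_with(0, snps, y, edges, diffs)
```

In this sketch the cost of deriving one table is proportional to the edge weight rather than to the sample size, which is why a low-weight spanning tree is a natural choice. Any contingency-table statistic, such as a chi-square test, could then be evaluated on each entry of `tables`.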

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c72/2881371/9517c26fb3db/btq186f1.jpg