Genome-wide genetic heterogeneity discovery with categorical covariates.

Authors

Llinares-López Felipe, Papaxanthos Laetitia, Bodenham Dean, Roqueiro Damian, Borgwardt Karsten

Affiliations

Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland.

SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Publication information

Bioinformatics. 2017 Jun 15;33(12):1820-1828. doi: 10.1093/bioinformatics/btx071.

Abstract

MOTIVATION

Genetic heterogeneity is the phenomenon that distinct genetic variants may give rise to the same phenotype. The recently introduced algorithm Fast Automatic Interval Search (FAIS) enables the genome-wide search of candidate regions for genetic heterogeneity in the form of any contiguous sequence of variants, and achieves high computational efficiency and statistical power. Although FAIS can test all possible genomic regions for association with a phenotype, a key limitation is its inability to correct for confounders such as gender or population structure, which may lead to numerous false-positive associations.
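To make the idea of testing "any contiguous sequence of variants" concrete, here is a minimal Python sketch. It assumes binary genotypes (1 = minor allele present) and assumes that an interval is summarised per sample by whether any variant inside it is set; the abstract does not spell out this encoding, and the function names `interval_indicator` and `all_intervals` are hypothetical, not part of FAIS or the fastcmh package.

```python
from typing import Iterator, List, Tuple


def interval_indicator(genotypes: List[List[int]], start: int, length: int) -> List[int]:
    """Per-sample flag: 1 if the sample carries a minor allele at any
    variant in positions [start, start + length), else 0.
    (Assumed encoding, used here for illustration only.)"""
    return [int(any(row[start:start + length])) for row in genotypes]


def all_intervals(n_variants: int) -> Iterator[Tuple[int, int]]:
    """Enumerate every contiguous interval as (start, length).
    There are n_variants * (n_variants + 1) / 2 of them, which is why a
    naive genome-wide scan is expensive."""
    for length in range(1, n_variants + 1):
        for start in range(n_variants - length + 1):
            yield start, length


# Toy data: rows are samples, columns are variants.
# Samples 0 and 1 carry different variants, yet the interval covering
# variants 1 and 2 flags both of them, which is the heterogeneity signal.
genotypes = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

flags = interval_indicator(genotypes, start=1, length=2)
n_candidates = sum(1 for _ in all_intervals(4))
```

The brute-force enumeration is only meant to show the size of the search space; it says nothing about how FAIS achieves its efficiency.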

RESULTS

We propose FastCMH, a method that overcomes this problem by properly accounting for categorical confounders, while still retaining statistical power and computational efficiency. Experiments comparing FastCMH with FAIS and multiple kinds of burden tests on simulated data, as well as on human and Arabidopsis samples, demonstrate that FastCMH can drastically reduce genomic inflation and discover associations that are missed by standard burden tests.
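The abstract does not name the statistical test, but the method's name suggests the Cochran-Mantel-Haenszel (CMH) test, the standard way to test a 2x2 association while stratifying on a categorical covariate. The sketch below is a textbook CMH statistic without continuity correction, written with the standard library only. It is an illustration under that assumption, not the fastcmh implementation, and it does not attempt the search strategy or multiple-testing correction a genome-wide scan would need. The function name `cmh_test` is hypothetical.

```python
import math
from collections import defaultdict
from typing import Hashable, Sequence, Tuple


def cmh_test(x: Sequence[int], y: Sequence[int],
             strata: Sequence[Hashable]) -> Tuple[float, float]:
    """Textbook CMH test for binary x (e.g. interval flag) and binary y
    (phenotype), stratified by a categorical covariate.
    Returns (statistic, p-value); the p-value uses a chi-square
    distribution with 1 degree of freedom."""
    groups = defaultdict(list)
    for xi, yi, si in zip(x, y, strata):
        groups[si].append((xi, yi))

    observed = expected = variance = 0.0
    for pairs in groups.values():
        n = len(pairs)
        if n < 2:
            continue  # a single-sample stratum carries no information
        n_cases = sum(yi for _, yi in pairs)
        n_carriers = sum(xi for xi, _ in pairs)
        both = sum(1 for xi, yi in pairs if xi == 1 and yi == 1)
        observed += both
        expected += n_cases * n_carriers / n
        variance += (n_cases * (n - n_cases) * n_carriers * (n - n_carriers)
                     / (n * n * (n - 1)))

    if variance == 0.0:
        return 0.0, 1.0  # no within-stratum variation: nothing to test
    stat = (observed - expected) ** 2 / variance
    # Survival function of chi-square with 1 d.o.f.: erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Toy confounded data: carrier status is fully determined by the stratum,
# and the stratum also shifts the case rate.
x = [1, 1, 1, 1, 0, 0, 0, 0]
y = [1, 1, 1, 0, 0, 0, 0, 1]
covariate = ["A", "A", "A", "A", "B", "B", "B", "B"]

pooled_stat, pooled_p = cmh_test(x, y, ["all"] * len(x))   # ignores the covariate
strat_stat, strat_p = cmh_test(x, y, covariate)            # conditions on it
```

In this toy example the pooled analysis yields a nonzero statistic, while the stratified analysis finds no within-stratum evidence at all, which is the kind of spurious signal that conditioning on a categorical confounder is meant to remove.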

AVAILABILITY AND IMPLEMENTATION

An R package fastcmh is available on CRAN and the source code can be found at: https://www.bsse.ethz.ch/mlcb/research/bioinformatics-and-computational-biology/fastcmh.html.

CONTACT

felipe.llinares@bsse.ethz.ch.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
