
Genome-wide genetic heterogeneity discovery with categorical covariates.

Authors

Llinares-López Felipe, Papaxanthos Laetitia, Bodenham Dean, Roqueiro Damian, Borgwardt Karsten

Affiliations

Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland.

SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Publication details

Bioinformatics. 2017 Jun 15;33(12):1820-1828. doi: 10.1093/bioinformatics/btx071.

DOI:10.1093/bioinformatics/btx071
PMID:28200033
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5870548/
Abstract

MOTIVATION

Genetic heterogeneity is the phenomenon that distinct genetic variants may give rise to the same phenotype. The recently introduced algorithm Fast Automatic Interval Search (FAIS) enables the genome-wide search of candidate regions for genetic heterogeneity in the form of any contiguous sequence of variants, and achieves high computational efficiency and statistical power. Although FAIS can test all possible genomic regions for association with a phenotype, a key limitation is its inability to correct for confounders such as gender or population structure, which may lead to numerous false-positive associations.
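
To make the search space concrete, here is a minimal sketch of enumerating every contiguous interval of variants and collapsing each one into a single binary "meta-marker" per sample. The OR encoding (a sample is 1 if it carries a minor allele anywhere in the interval) is an assumption made for illustration, since the abstract does not state the encoding; the function name `interval_meta_markers` and the toy genotype matrix are hypothetical. The brute-force enumeration only shows what is being tested and is not the efficient search the abstract refers to.

```python
from typing import Dict, List, Tuple


def interval_meta_markers(
    genotypes: List[List[int]],
) -> Dict[Tuple[int, int], List[int]]:
    """Map each contiguous interval (start, end), end inclusive, to a meta-marker.

    genotypes[i][j] is 1 if sample i carries a minor allele at variant j, else 0.
    Assumption: an interval's meta-marker is the logical OR of its variants.
    """
    n_samples = len(genotypes)
    n_variants = len(genotypes[0]) if genotypes else 0
    markers: Dict[Tuple[int, int], List[int]] = {}
    for start in range(n_variants):
        current = [0] * n_samples
        for end in range(start, n_variants):
            # Extending the interval by one variant needs one OR per sample.
            current = [c | row[end] for c, row in zip(current, genotypes)]
            markers[(start, end)] = current
    return markers


# Toy data: 4 samples x 3 variants; three samples carry a different variant.
genotypes = [
    [1, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
    [0, 1, 0],
]
markers = interval_meta_markers(genotypes)
print(len(markers))     # 6 intervals for 3 variants
print(markers[(0, 2)])  # [1, 1, 0, 1]
```

In the toy data no single variant is shared by more than one sample, yet the interval covering all three variants marks three of the four samples. This is the kind of signal that single-variant tests can miss. With L variants there are L(L+1)/2 intervals, which is why both an efficient search and a multiple-testing correction matter at genome scale.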

RESULTS

We propose FastCMH, a method that overcomes this problem by properly accounting for categorical confounders, while still retaining statistical power and computational efficiency. Experiments comparing FastCMH with FAIS and multiple kinds of burden tests on simulated data, as well as on human and Arabidopsis samples, demonstrate that FastCMH can drastically reduce genomic inflation and discover associations that are missed by standard burden tests.
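
The name FastCMH points to the Cochran-Mantel-Haenszel (CMH) test, a standard test for association in 2x2 tables stratified by a categorical covariate. The abstract does not spell out the statistic, so the sketch below illustrates the general CMH test (without continuity correction) and is not the authors' implementation. The function name `cmh_test` and the counts are hypothetical, chosen so that a marker looks associated with the phenotype only when the two strata are pooled.

```python
import math
from typing import List, Tuple

# One 2x2 table per stratum, as (a, b, c, d):
#   a = marker present & case,  b = marker present & control,
#   c = marker absent  & case,  d = marker absent  & control.
Table = Tuple[int, int, int, int]


def cmh_test(tables: List[Table]) -> Tuple[float, float]:
    """Return (statistic, p-value) of the CMH test, no continuity correction."""
    observed = expected = variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than 2 samples carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        observed += a
        expected += row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if variance == 0.0:
        return 0.0, 1.0
    stat = (observed - expected) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical strata (e.g. two populations): odds ratio is 1 in each.
strata = [(40, 10, 8, 2), (2, 8, 10, 40)]
# Ignoring the covariate means summing the strata into one table.
pooled = [tuple(sum(cells) for cells in zip(*strata))]

stratified_stat, stratified_p = cmh_test(strata)
pooled_stat, pooled_p = cmh_test(pooled)
print(stratified_stat, stratified_p)  # 0.0 1.0
print(round(pooled_stat, 2))          # 19.04
```

Within each stratum the marker is independent of the phenotype, so the stratified statistic is zero. Pooling the strata yields a large statistic because both the marker frequency and the case fraction differ between the two groups. This is the confounding pattern that conditioning on a categorical covariate is meant to remove.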

AVAILABILITY AND IMPLEMENTATION

An R package fastcmh is available on CRAN and the source code can be found at: https://www.bsse.ethz.ch/mlcb/research/bioinformatics-and-computational-biology/fastcmh.html.

CONTACT

felipe.llinares@bsse.ethz.ch.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
