A hidden Markov random field model for genome-wide association studies.

Affiliation

Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA.

Publication information

Biostatistics. 2010 Jan;11(1):139-50. doi: 10.1093/biostatistics/kxp043. Epub 2009 Oct 12.

DOI: 10.1093/biostatistics/kxp043
PMID: 19822692
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2800164/

Abstract

Genome-wide association studies (GWAS) are increasingly used to identify novel genetic susceptibility variants for complex traits, but there is little consensus on how to analyze such data. The most commonly used methods are single-marker analysis of single nucleotide polymorphisms (SNPs) and haplotype analysis, each with Bonferroni correction for multiple comparisons. Because the SNPs in a typical GWAS are often in linkage disequilibrium (LD), at least locally, Bonferroni correction often leads to conservative error control and therefore lower statistical power. In this paper, we propose a hidden Markov random field (HMRF) model for GWAS analysis, based on a weighted LD graph built from prior LD information among the SNPs, together with an efficient iterative conditional mode algorithm for estimating the model parameters. The model uses the LD information when calculating the posterior probability that a SNP is associated with the disease. These posterior probabilities can then be used to define a false discovery controlling procedure for selecting the disease-associated SNPs. Simulation studies demonstrated a potential gain in power over single-SNP analysis. The proposed method is especially effective at identifying SNPs that have borderline significance at the single-marker level but are in high LD with significant SNPs. In addition, by considering the SNPs in LD simultaneously, the method can help reduce the number of false identifications of disease-associated SNPs. We demonstrate the HMRF model on data from a case-control GWAS of neuroblastoma and identify one new SNP that is potentially associated with neuroblastoma.
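
The pipeline described in the abstract (a weighted LD graph, iterative conditional mode updates, posterior probabilities, and a false discovery controlling selection step) can be illustrated with a small Python sketch. This is not the paper's implementation. The Ising-type prior, the Gaussian score model, the fixed parameter values (`gamma`, `beta`, `mu1`), and all function names are assumptions chosen for illustration. In particular, the parameters are held fixed here, whereas the paper estimates them.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def conditional_logit(i, z, states, nbrs, gamma, beta, mu1):
    """Log-odds that SNP i is associated, given its LD neighbors' states.

    Assumed model (illustrative only): scores are N(0, 1) under the null and
    N(mu1, 1) when associated; the prior is Ising-type with LD edge weights.
    """
    prior = gamma + beta * sum(w * (2 * states[j] - 1) for j, w in nbrs[i])
    log_lr = mu1 * z[i] - 0.5 * mu1 ** 2  # log f1(z) - log f0(z)
    return prior + log_lr

def icm_posteriors(z, edges, gamma=-2.0, beta=1.5, mu1=3.0, max_sweeps=50):
    """Run ICM sweeps over the hidden states with fixed (assumed) parameters.

    z:     per-SNP association scores
    edges: {(i, j): weight} describing the weighted LD graph
    Returns the final states and each SNP's conditional posterior probability.
    """
    n = len(z)
    nbrs = [[] for _ in range(n)]
    for (i, j), w in edges.items():
        nbrs[i].append((j, w))
        nbrs[j].append((i, w))
    states = [0] * n
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            logit = conditional_logit(i, z, states, nbrs, gamma, beta, mu1)
            new_state = 1 if logit > 0 else 0
            if new_state != states[i]:
                states[i] = new_state
                changed = True
        if not changed:  # a full sweep with no change: local mode reached
            break
    post = [sigmoid(conditional_logit(i, z, states, nbrs, gamma, beta, mu1))
            for i in range(n)]
    return states, post

def select_by_posterior_fdr(post, alpha=0.1):
    """Select SNPs while the mean of (1 - posterior) over the selection
    stays at or below alpha (a common posterior-based FDR rule)."""
    order = sorted(range(len(post)), key=lambda i: 1.0 - post[i])
    selected, running = [], 0.0
    for k, i in enumerate(order, start=1):
        running += 1.0 - post[i]
        if running / k > alpha:
            break
        selected.append(i)
    return selected

if __name__ == "__main__":
    # Hypothetical toy data: SNPs 1 and 4 have the same borderline score,
    # but only SNP 1 is in high LD with a strongly significant SNP (SNP 0).
    z = [4.5, 2.2, 0.3, -0.5, 2.2, 0.1]
    edges = {(0, 1): 0.9, (2, 3): 0.5, (4, 5): 0.8}
    states, post = icm_posteriors(z, edges)
    print(states, [round(p, 3) for p in post])
    print(select_by_posterior_fdr(post, alpha=0.1))
```

In this toy setting, SNPs 1 and 4 carry identical single-marker evidence, so any difference between their posterior probabilities comes entirely from the LD graph. That is the mechanism the abstract credits for recovering borderline SNPs and for reducing false identifications.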
