Haplotype block partitioning as a tool for dimensionality reduction in SNP association studies.

Author information

Pattaro Cristian, Ruczinski Ingo, Fallin Danièle M, Parmigiani Giovanni

Affiliation

Unit of Genetic Epidemiology and Biostatistics, Institute of Genetic Medicine, European Academy, Viale Druso 1, I-39100, Bolzano, Italy.

Publication information

BMC Genomics. 2008 Aug 29;9:405. doi: 10.1186/1471-2164-9-405.

DOI:10.1186/1471-2164-9-405

PMID:18759977

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2547855/

Abstract

BACKGROUND

Identification of disease-related genes in association studies is challenged by the large number of SNPs typed. To address the dilution of power caused by high dimensionality, and to generate results that are biologically interpretable, it is critical to take into consideration spatial correlation of SNPs along the genome. With the goal of identifying true genetic associations, partitioning the genome according to spatial correlation can be a powerful and meaningful way to address this dimensionality problem.

RESULTS

We developed and validated an MCMC Algorithm To Identify blocks of Linkage DisEquilibrium (MATILDE) for clustering contiguous SNPs, and a statistical testing framework to detect association using partitions as units of analysis. We compared its ability to detect true SNP associations to that of the most commonly used algorithm for block partitioning, as implemented in the Haploview and HapBlock software. Simulations were based on artificially assigning phenotypes to individuals with SNPs corresponding to region 14q11 of the HapMap database. When block partitioning is performed using MATILDE, the ability to correctly identify a disease SNP is higher, especially for small effects, than it is with the alternatives considered. The advantages appear both in the number of true positive findings and in limiting the number of false discoveries. Finer partitions provided by LD-based methods or by marker-by-marker analysis are efficient only for detecting large effects, or in the presence of large sample sizes. The probabilistic approach we propose offers several additional advantages, including: a) adapting the estimation of blocks to the population, technology, and sample size of the study; b) probabilistic assessment of uncertainty about block boundaries and about whether any two SNPs are in the same block; c) user selection of the probability threshold for assigning SNPs to the same block.

CONCLUSION

We demonstrate that, in realistic scenarios, our adaptive, study-specific block partitioning approach is as efficient as, or more efficient than, currently available LD-based approaches in guiding the search for disease loci.
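Points (b) and (c) of the Results describe a posterior probability that two SNPs share a block and a user-chosen threshold on that probability. The abstract does not say how MATILDE applies the threshold, so the following is only a minimal sketch under stated assumptions, not the authors' implementation: each MCMC draw is represented as a list of break positions, the same-block probability is estimated for adjacent SNP pairs only, and a block is cut wherever that probability falls below the threshold. The function names and the toy draws are hypothetical.

```python
from typing import List


def adjacent_same_block_prob(samples: List[List[int]], n_snps: int) -> List[float]:
    """Estimate, for each adjacent SNP pair (i, i+1), the fraction of sampled
    partitions that keep both SNPs in the same block.

    Assumed encoding: each sample lists break positions b, meaning a block
    boundary falls between SNP b-1 and SNP b (0-based indices).
    """
    counts = [0] * (n_snps - 1)
    for boundaries in samples:
        breaks = set(boundaries)
        for i in range(n_snps - 1):
            if (i + 1) not in breaks:
                counts[i] += 1
    return [c / len(samples) for c in counts]


def blocks_from_threshold(probs: List[float], threshold: float) -> List[List[int]]:
    """Start a new block wherever the same-block probability is below threshold."""
    blocks = [[0]]
    for i, p in enumerate(probs):
        if p >= threshold:
            blocks[-1].append(i + 1)   # SNP i+1 joins the current block
        else:
            blocks.append([i + 1])     # boundary between SNP i and SNP i+1
    return blocks


# Hypothetical posterior draws for 6 SNPs (illustrative data, not from the paper).
samples = [[2, 4], [2], [2, 4], [3, 4]]
probs = adjacent_same_block_prob(samples, n_snps=6)

print(probs)                              # [1.0, 0.25, 0.75, 0.25, 1.0]
print(blocks_from_threshold(probs, 0.5))  # [[0, 1], [2, 3], [4, 5]]
print(blocks_from_threshold(probs, 0.8))  # [[0, 1], [2], [3], [4, 5]]
```

In this toy example, raising the threshold from 0.5 to 0.8 splits the middle block, which illustrates how a stricter threshold yields a finer partition and therefore more units to test.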
