Selection of genetic markers for association analyses, using linkage disequilibrium and haplotypes.

Author information

Meng Zhaoling, Zaykin Dmitri V, Xu Chun-Fang, Wagner Michael, Ehm Margaret G

Affiliation

Bioinformatics Research Center, Campus Box 7566, North Carolina State University, Raleigh, NC 27695-7566, USA.

Publication information

Am J Hum Genet. 2003 Jul;73(1):115-30. doi: 10.1086/376561. Epub 2003 Jun 5.

DOI:10.1086/376561

PMID:12796855

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1180574/

Abstract

The genotyping of closely spaced single-nucleotide polymorphism (SNP) markers frequently yields highly correlated data, owing to extensive linkage disequilibrium (LD) between markers. The extent of LD varies widely across the genome and drives the number of frequent haplotypes observed in small regions. Several studies have illustrated the possibility that LD or haplotype data could be used to select a subset of SNPs that optimize the information retained in a genomic region while reducing the genotyping effort and simplifying the analysis. We propose a method based on the spectral decomposition of the matrices of pairwise LD between markers, and we select markers on the basis of their contributions to the total genetic variation. We also modify Clayton's "haplotype tagging SNP" selection method, which utilizes haplotype information. For both methods, we propose sliding window-based algorithms that allow the methods to be applied to large chromosomal regions. Our procedures require genotype information about a small number of individuals for an initial set of SNPs and selection of an optimum subset of SNPs that could be efficiently genotyped on larger numbers of samples while retaining most of the genetic variation in samples. We identify suitable parameter combinations for the procedures, and we show that a sample size of 50-100 individuals achieves consistent results in studies of simulated data sets in linkage equilibrium and LD. When applied to experimental data sets, both procedures were similarly effective at reducing the genotyping requirement while maintaining the genetic information content throughout the regions. We also show that haplotype-association results that Hosking et al. obtained near CYP2D6 were almost identical before and after marker selection.
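
The spectral-decomposition idea in the abstract can be illustrated with a small sketch. This is not the authors' published procedure: it assumes the pairwise LD matrix can be approximated by the correlation matrix of 0/1/2 genotype counts, and it uses a simple, commonly used rule (one marker per leading eigenvector, chosen by largest absolute loading). The function name `select_markers` and the parameter `var_fraction` are hypothetical.

```python
import numpy as np


def select_markers(genotypes, var_fraction=0.9):
    """Pick a SNP subset from an (individuals x SNPs) matrix of 0/1/2 allele counts.

    Illustrative only; assumes every SNP is polymorphic (no constant columns).
    """
    g = np.asarray(genotypes, dtype=float)
    n_snps = g.shape[1]
    # Pairwise genotype correlation, used here as a stand-in for the LD matrix.
    ld = np.corrcoef(g, rowvar=False)
    # Spectral decomposition of the symmetric matrix; eigh returns ascending order.
    eigvals, eigvecs = np.linalg.eigh(ld)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Smallest number of leading eigenvalues that reaches var_fraction of the total.
    cum = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(cum, var_fraction)) + 1, n_snps)
    available = np.ones(n_snps, dtype=bool)
    chosen = []
    for j in range(k):
        # Largest absolute loading on component j among SNPs not yet selected.
        loadings = np.where(available, np.abs(eigvecs[:, j]), -1.0)
        pick = int(np.argmax(loadings))
        chosen.append(pick)
        available[pick] = False
    return sorted(chosen)


# Toy data: SNPs 0-2 are in perfect LD, SNPs 3-4 are in perfect LD,
# and the two groups are uncorrelated with each other.
a = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
b = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2])
toy = np.column_stack([a, a, a, b, b])
print(select_markers(toy, var_fraction=0.9))
```

On the toy data, two eigenvalues carry all of the variation, so the sketch keeps one SNP from each correlated group. A sliding-window variant, as the abstract describes for large regions, would apply the same function to overlapping column ranges.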

Similar articles

The impact of missing and erroneous genotypes on tagging SNP selection and power of subsequent association tests.

Hum Hered. 2006;61(1):31-44. doi: 10.1159/000092141. Epub 2006 Mar 23.

Haplotype block partitioning and tag SNP selection using genotype data and their applications to association studies.

Genome Res. 2004 May;14(5):908-16. doi: 10.1101/gr.1837404. Epub 2004 Apr 12.

Accounting for linkage disequilibrium among markers in linkage analysis: impact of haplotype frequency estimation and molecular haplotypes for a gene in a candidate region for Alzheimer's disease.

Hum Hered. 2007;63(1):26-34. doi: 10.1159/000098459. Epub 2007 Jan 11.

Characterization of multilocus linkage disequilibrium.

Genet Epidemiol. 2005 Apr;28(3):193-206. doi: 10.1002/gepi.20056.

Patterns of linkage disequilibrium between SNPs in a Sardinian population isolate and the selection of markers for association studies.

Hum Hered. 2008;65(1):9-22. doi: 10.1159/000106058. Epub 2007 Jul 25.

Optimal selection of SNP markers for disease association studies.

Hum Hered. 2004;58(3-4):190-202. doi: 10.1159/000083546.

The effect of single-nucleotide polymorphism marker selection on patterns of haplotype blocks and haplotype frequency estimates.

Am J Hum Genet. 2005 Dec;77(6):988-98. doi: 10.1086/498175. Epub 2005 Oct 19.

New multilocus linkage disequilibrium measure for tag SNP selection.

J Bioinform Comput Biol. 2017 Feb;15(1):1750001. doi: 10.1142/S0219720017500019.

Linkage disequilibrium and haplotype block patterns in popcorn populations.

PLoS One. 2019 Sep 25;14(9):e0219417. doi: 10.1371/journal.pone.0219417. eCollection 2019.

Cited by

The polygenic implication of clopidogrel responsiveness: Insights from platelet reactivity analysis and next-generation sequencing.

PLoS One. 2024 Jul 11;19(7):e0306445. doi: 10.1371/journal.pone.0306445. eCollection 2024.

Genetic diversity and population structure analysis of a diverse panel of pea ().

Front Genet. 2024 May 30;15:1396888. doi: 10.3389/fgene.2024.1396888. eCollection 2024.

Demographic dynamics and molecular evolution of the rare and endangered subsect. Gerardianae of Pinus: insights from chloroplast genomes and mitochondrial DNA markers.

Planta. 2024 Jan 28;259(2):45. doi: 10.1007/s00425-023-04316-8.

Genetic polymorphisms as potential pharmacogenetic biomarkers for platinum-based chemotherapy in non-small cell lung cancer.

Mol Biol Rep. 2024 Jan 13;51(1):102. doi: 10.1007/s11033-023-08915-2.

Development of diagnostic SNP markers for quality assurance and control in sweetpotato [Ipomoea batatas (L.) Lam.] breeding programs.

PLoS One. 2020 Apr 24;15(4):e0232173. doi: 10.1371/journal.pone.0232173. eCollection 2020.

An evaluation of machine-learning for predicting phenotype: studies in yeast, rice, and wheat.

Mach Learn. 2020;109(2):251-277. doi: 10.1007/s10994-019-05848-5. Epub 2019 Oct 23.

A novel linkage-disequilibrium corrected genomic relationship matrix for SNP-heritability estimation and genomic prediction.

Heredity (Edinb). 2018 Apr;120(4):356-368. doi: 10.1038/s41437-017-0023-4. Epub 2017 Dec 14.

Evaluating information content of SNPs for sample-tagging in re-sequencing projects.

Sci Rep. 2015 May 15;5:10247. doi: 10.1038/srep10247.

Heavy metals, organic solvents, and multiple sclerosis: An exploratory look at gene-environment interactions.

Arch Environ Occup Health. 2016;71(1):26-34. doi: 10.1080/19338244.2014.937381. Epub 2014 Aug 19.

Genetic analysis and mapping of genes for resistance to multiple strains of Soybean mosaic virus in a single resistant soybean accession PI 96983.

Theor Appl Genet. 2013 Jul;126(7):1783-91. doi: 10.1007/s00122-013-2092-y. Epub 2013 Apr 12.

References

On the theory of random mating.

Ann Eugen. 1954 Mar;18(4):311-7. doi: 10.1111/j.1469-1809.1952.tb02522.x.

A first-generation linkage disequilibrium map of human chromosome 22.

Nature. 2002 Aug 1;418(6897):544-8. doi: 10.1038/nature00864. Epub 2002 Jul 10.

Linkage disequilibrium mapping identifies a 390 kb region associated with CYP2D6 poor drug metabolising activity.

Pharmacogenomics J. 2002;2(3):165-75. doi: 10.1038/sj.tpj.6500096.

A dynamic programming algorithm for haplotype block partitioning.

Proc Natl Acad Sci U S A. 2002 May 28;99(11):7335-9. doi: 10.1073/pnas.102186799.

Genomics. New mapping project splits the community.

Science. 2002 May 24;296(5572):1391-3. doi: 10.1126/science.296.5572.1391.

The structure of haplotype blocks in the human genome.

Science. 2002 Jun 21;296(5576):2225-9. doi: 10.1126/science.1069424. Epub 2002 May 23.

Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21.

Science. 2001 Nov 23;294(5547):1719-23. doi: 10.1126/science.1065573.

Haplotype tagging for the identification of common disease genes.

Nat Genet. 2001 Oct;29(2):233-7. doi: 10.1038/ng1001-233.

High-resolution haplotype structure in the human genome.

Nat Genet. 2001 Oct;29(2):229-32. doi: 10.1038/ng1001-229.

Sequence variation and linkage disequilibrium in the human T-cell receptor beta (TCRB) locus.

Am J Hum Genet. 2001 Aug;69(2):381-95. doi: 10.1086/321297. Epub 2001 Jun 29.
