Handling linkage disequilibrium in qualitative trait linkage analysis using dense SNPs: a two-step strategy.

Author information

Cho Kelly, Dupuis Josée

Affiliations

Departments of Genetics and Biostatistics, Yale University Schools of Medicine and Public Health, New Haven, CT 06520-8034, USA.

Publication information

BMC Genet. 2009 Aug 10;10:44. doi: 10.1186/1471-2156-10-44.

Abstract

BACKGROUND

In affected sibling pair linkage analysis, the presence of linkage disequilibrium (LD) has been shown to lead to overestimation of the number of alleles shared identity-by-descent (IBD) among sibling pairs when parents are ungenotyped. This inflation results in spurious evidence for linkage even when the markers and the disease locus are not linked. In our study, we first theoretically evaluate how inflation in IBD probabilities leads to overestimation of a nonparametric linkage (NPL) statistic under the assumption of linkage equilibrium. Next, we propose a two-step processing strategy in order to systematically evaluate approaches to handle LD. Based on the inflation of the expected logarithm of the odds (LOD) score observed in our theoretical exploration, we implemented the proposed two-step processing strategy. Step 1 involves three techniques to filter a dense set of markers. In step 2, we take the subset of markers selected in step 1 and apply four different methods of handling LD among dense markers: 1) marker thinning (MT); 2) recursive elimination (RE); 3) SNPLINK; and 4) the LD modeling approach in MERLIN. We evaluate the relative performance of each method through simulation.
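As a rough illustration of why inflated IBD estimates turn into spurious linkage evidence, consider a simplified mean-sharing statistic for n affected sibling pairs. This is a textbook-style sketch and not the derivation used in the paper: the symbols $\hat{\pi}_i$, $\delta$, and $n$ are introduced here for illustration only, and IBD sharing is treated as if it were fully observed.

```latex
% \hat{\pi}_i : estimated proportion of alleles shared IBD by sibling pair i
% Under no linkage, a pair shares 0, 1, or 2 alleles IBD with
% probabilities 1/4, 1/2, 1/4, so the true sharing proportion \pi_i has
\mathbb{E}[\pi_i] = \tfrac{1}{2}, \qquad \operatorname{Var}(\pi_i) = \tfrac{1}{8}.

% Standardized mean-sharing statistic over n pairs:
Z = \sqrt{8n}\,\Bigl(\tfrac{1}{n}\sum_{i=1}^{n}\hat{\pi}_i - \tfrac{1}{2}\Bigr)

% If unmodeled LD biases each estimate upward by an average \delta > 0
% at a marker that is not linked to the disease locus:
\mathbb{E}[Z] = \sqrt{8n}\,\delta
```

Under these assumptions the expected statistic grows with $\sqrt{n}$ for any fixed positive bias, so a larger sample makes a false linkage signal more likely rather than less.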

RESULTS

We observed LOD score inflation only when the parents were ungenotyped. For a given number of markers, all approaches evaluated for each type of LD threshold performed similarly; however, the RE approach was the only one that eliminated the LOD score bias. Our simulation results indicate a reduction of approximately 75% to complete elimination of the LOD score inflation, while maintaining the information content (IC), when setting a tolerable squared correlation coefficient LD threshold (r²) above 0.3 or 2 SNPs per cM using MT.
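To make the two kinds of threshold concrete, here is a minimal Python sketch of (a) a greedy filter on pairwise r² and (b) thinning to a target marker density. It is an illustration under stated assumptions, not the authors' implementation of MT or RE: the function names are hypothetical, r² is approximated by the squared Pearson correlation of 0/1/2 genotype dosages rather than computed from phased haplotypes, and both passes are simple greedy walks in map order.

```python
from typing import List, Sequence


def r_squared(g1: Sequence[int], g2: Sequence[int]) -> float:
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2).

    Assumption: dosage correlation is used as a stand-in for haplotype-based r^2.
    """
    n = len(g1)
    if n == 0 or n != len(g2):
        raise ValueError("genotype vectors must be non-empty and of equal length")
    m1 = sum(g1) / n
    m2 = sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return 0.0  # monomorphic marker: correlation undefined, treated as 0 here
    return cov * cov / (v1 * v2)


def prune_by_r2(genotypes: List[Sequence[int]], max_r2: float) -> List[int]:
    """Keep a marker only if its r^2 with every already-kept marker is <= max_r2.

    `genotypes` holds one dosage vector per marker, in map order.
    Returns the indices of the kept markers.
    """
    kept: List[int] = []
    for i, g in enumerate(genotypes):
        if all(r_squared(g, genotypes[j]) <= max_r2 for j in kept):
            kept.append(i)
    return kept


def thin_by_density(positions_cm: Sequence[float], snps_per_cm: float) -> List[int]:
    """Keep markers spaced at least 1/snps_per_cm cM apart (greedy, in map order).

    Returns the indices of the kept markers.
    """
    if snps_per_cm <= 0:
        raise ValueError("snps_per_cm must be positive")
    min_gap = 1.0 / snps_per_cm
    kept: List[int] = []
    last_pos = None
    order = sorted(range(len(positions_cm)), key=lambda i: positions_cm[i])
    for i in order:
        pos = positions_cm[i]
        if last_pos is None or pos - last_pos >= min_gap:
            kept.append(i)
            last_pos = pos
    return kept


if __name__ == "__main__":
    # Toy data: markers 0 and 1 are perfectly correlated, marker 2 is not.
    toy_genotypes = [
        [0, 1, 2, 1, 0, 2],
        [0, 1, 2, 1, 0, 2],
        [1, 0, 1, 2, 0, 1],
    ]
    print(prune_by_r2(toy_genotypes, max_r2=0.3))
    print(thin_by_density([0.0, 0.1, 0.5, 0.75, 1.0, 1.25, 2.0], snps_per_cm=2))
```

Greedy passes like these depend on marker order, so a different starting marker can yield a different retained subset; that sensitivity is one reason a systematic comparison of methods is useful.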

CONCLUSION

We have established a theoretical basis of how inflated IBD information among dense markers overestimates a NPL statistic. The two-step processing strategy serves as a useful framework to systematically evaluate relative performance of different methods to handle LD.
