Evaluation of sample size effect on the identification of haplotype blocks.

Authors

Osabe Dai, Tanahashi Toshihito, Nomura Kyoko, Shinohara Shuichi, Nakamura Naoto, Yoshikawa Toshikazu, Shiota Hiroshi, Keshavarz Parvaneh, Yamaguchi Yuka, Kunika Kiyoshi, Moritani Maki, Inoue Hiroshi, Itakura Mitsuo

Affiliation

Department of Bioinformatics, Division of Life Science Systems, Fujitsu Limited, Higashishinbashi, Minato-ku, Tokyo, Japan.

Publication

BMC Bioinformatics. 2007 Jun 14;8:200. doi: 10.1186/1471-2105-8-200.

DOI: 10.1186/1471-2105-8-200

PMID: 17567919

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1913927/

Abstract

BACKGROUND

Genome-wide maps of linkage disequilibrium (LD) and haplotypes have been created for different populations. Substantial sharing of block boundaries and haplotypes among populations has been observed, but haplotype variation across populations has also been reported. These conflicting observations on the extent and distribution of haplotypes require careful examination. The mechanisms that shape haplotypes have not been fully explored, although an effect of sample size has been implicated. We present a close examination of the effect of sample size on haplotype blocks using an original computational simulation.

RESULTS

A region spanning 19.31 Mb on chromosome 20q was genotyped for 1,147 SNPs in 725 Japanese subjects. One 445-kb region exhibiting a single strong LD value (average |D'| = 0.94) was selected for analysis of the effect of sample size on haplotype structure. Three different block definitions (recombination-based, LD-based, and diversity-based) were used to build block-identification simulations, with theta values derived from the real genotyping data. Haplotype blocks proved quite difficult to estimate from data with fewer than 200 samples. A reliable haplotype structure could not be obtained with 50 samples, even though the simulation was repeated 10,000 times.
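For readers unfamiliar with the LD statistic quoted above, the following is a minimal sketch of how |D'| between two biallelic SNPs is conventionally computed. It assumes phased haplotypes with alleles coded 0/1, which is a simplification: real genotype data are unphased and haplotype frequencies usually have to be estimated first. The function name `abs_d_prime` is illustrative and does not come from the paper.

```python
def abs_d_prime(haplotypes):
    """Return |D'| for two biallelic SNPs, or None if either SNP is monomorphic.

    haplotypes: list of (a, b) tuples, alleles coded 0 or 1 (assumed phased).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b                             # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return None                                  # D' is undefined without variation
    return min(1.0, abs(d) / d_max)                  # clamp against rounding error


if __name__ == "__main__":
    # Two toy data sets: complete LD and no LD.
    complete = [(1, 1)] * 50 + [(0, 0)] * 50
    none = [(1, 1)] * 25 + [(1, 0)] * 25 + [(0, 1)] * 25 + [(0, 0)] * 25
    print(abs_d_prime(complete), abs_d_prime(none))
```

An average of this quantity over all SNP pairs in a region is one way to arrive at a summary such as the 0.94 reported here, although the abstract does not state how the average was taken.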

CONCLUSION

These analyses underscore the difficulty of estimating haplotype blocks. To obtain a reliable result, it would be necessary to increase the sample size beyond 725 and to repeat the simulation 3,000 times. Even in a genomic region showing a high LD value, the haplotype block may be fragile. We emphasize the importance of applying careful confidence measures when using an estimated haplotype structure in biomedical research.
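The dependence on sample size can be illustrated with a simple resampling experiment. The sketch below is not the authors' simulation, which used three block definitions and theta values from the real data. It only draws repeated subsamples from a hypothetical pool of phased two-SNP haplotypes and reports how much the |D'| estimate varies at each subsample size. The pool composition, the function names, and the replicate count are all assumptions made for illustration.

```python
import random
import statistics


def abs_d_prime(haplotypes):
    """|D'| for two biallelic SNPs (alleles coded 0/1); None if undefined."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return None
    return min(1.0, abs(d) / d_max)


def subsample_spread(pool, sample_size, replicates, seed=0):
    """Mean and standard deviation of |D'| over repeated subsamples.

    Subsamples are drawn without replacement from `pool`; replicates in
    which |D'| is undefined (a monomorphic SNP) are skipped.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(replicates):
        estimate = abs_d_prime(rng.sample(pool, sample_size))
        if estimate is not None:
            values.append(estimate)
    return statistics.mean(values), statistics.pstdev(values)


# Hypothetical pool of 1,450 haplotypes (two per subject for 725 subjects)
# with strong but incomplete LD between the two SNPs.
POOL = [(1, 1)] * 700 + [(0, 0)] * 700 + [(1, 0)] * 25 + [(0, 1)] * 25

if __name__ == "__main__":
    for size in (50, 200, 1000):
        mean, sd = subsample_spread(POOL, size, replicates=200)
        print(f"n={size:5d}  mean |D'|={mean:.3f}  sd={sd:.3f}")
```

In a setup like this, the spread of the estimate is expected to shrink as the subsample grows, which is the kind of instability at small sample sizes that the conclusion warns about. Block calls built on thresholds of such estimates would inherit that instability.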

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4e8/1913927/f0f23b369ea4/1471-2105-8-200-1.jpg
