
Haplotypic structure of the X chromosome in the COGA population sample and the quality of its reconstruction by extant software packages.

Author information

Center of Statistical Genetics, c/o Centro Retrovirus, SS Abetone e Brennero 2, 56127 Pisa, Italy.

Publication information

BMC Genet. 2005 Dec 30;6 Suppl 1(Suppl 1):S77. doi: 10.1186/1471-2156-6-S1-S77.

Abstract

BACKGROUND

X-chromosome haplotypes can be counted directly in males, whereas the diplotypes of females can be inferred from the haplotypes of their sons or fathers. Here, we investigated: 1) the possible large-scale haplotypic structure of the X chromosome in a Caucasian population sample, given the single-nucleotide polymorphism (SNP) maps and genotypes provided by Illumina and Affymetrix for Genetic Analysis Workshop 14; and 2) the performance of widely used programs in reconstructing haplotypes from population genotypic data, given their known distribution in a sample of unrelated individuals.
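The inference of a mother's diplotype from her son's haplotype can be illustrated with a minimal Python sketch. It is not the authors' procedure: the 0/1/2 genotype encoding, the function name `infer_mother_diplotype`, and the example data are assumptions made here for illustration, and the sketch further assumes no genotyping error and no recombination within the marker window considered.

```python
def infer_mother_diplotype(mother_genotype, son_haplotype):
    """Split a mother's unphased genotype into two haplotypes.

    mother_genotype: alternate-allele counts (0, 1 or 2) per SNP (assumed encoding).
    son_haplotype:   alleles (0 or 1) on the son's single X chromosome.
    Returns (transmitted, untransmitted) haplotypes as lists of 0/1.
    """
    if len(mother_genotype) != len(son_haplotype):
        raise ValueError("marker lists differ in length")
    untransmitted = []
    for g, s in zip(mother_genotype, son_haplotype):
        other = g - s  # allele left on the mother's other X chromosome
        if other not in (0, 1):
            raise ValueError("mother and son genotypes are incompatible")
        untransmitted.append(other)
    # The transmitted haplotype is simply the son's observed haplotype.
    return list(son_haplotype), untransmitted


# Hypothetical 5-SNP example.
mother = [1, 2, 0, 1, 1]
son = [1, 1, 0, 0, 1]
transmitted, untransmitted = infer_mother_diplotype(mother, son)
print(transmitted, untransmitted)
```

Because the son carries a single X chromosome, his genotype is already a haplotype, so subtracting it from the mother's allele counts yields her second haplotype without any statistical phasing.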

RESULTS

All possible unrelated mother-son pairs of Caucasian ancestry (N = 104) were selected from the 143 families in the Collaborative Study on the Genetics of Alcoholism (COGA) pedigree files, and the diplotypes of the mothers were inferred from the X chromosomes of their sons. The marker set included 313 SNPs at an average density of one marker per 0.47 Mb. Linkage disequilibrium between pairs of markers was measured by the parameter D'. To measure multilocus disequilibrium, we developed an index called D* and applied it to all possible sliding windows of 5 markers each. The results showed a complex haplotypic structure, with regions of low linkage disequilibrium separated by regions with high values of D*. The following programs were evaluated for their accuracy in inferring population haplotype frequencies: 1) ARLEQUIN 2.001; 2) PHASE 2.1.1; 3) SNPHAP 1.1; 4) HAPLOBLOCK 1.2; 5) HAPLOTYPER 1.0. Performance was evaluated by the Pearson correlation coefficient (r) between the true and the inferred distributions of haplotype frequencies.
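As a rough illustration of the pairwise measure, the sketch below computes the absolute value of Lewontin's D' for two biallelic markers from phased haplotypes, and enumerates 5-marker sliding windows. The 0/1 allele coding, the function names, and the toy data are assumptions made here. The multilocus index D* is not defined in this abstract, so it is deliberately not reproduced.

```python
def d_prime(haplotypes, i, j):
    """Absolute Lewontin's D' between markers i and j.

    haplotypes: sequence of phased haplotypes, each a sequence of 0/1 alleles.
    """
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n   # frequency of allele 1 at marker i
    p_b = sum(h[j] for h in haplotypes) / n   # frequency of allele 1 at marker j
    p_ab = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0                            # at least one marker is monomorphic
    return abs(d) / d_max


def sliding_windows(n_markers, size=5):
    """Yield (start, end) index pairs for every window of `size` adjacent markers."""
    for start in range(n_markers - size + 1):
        yield start, start + size


# Toy data: complete association, then complete independence.
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(d_prime(linked, 0, 1), d_prime(unlinked, 0, 1))

# With 313 markers, windows of 5 adjacent markers give 313 - 5 + 1 positions.
n_windows = sum(1 for _ in sliding_windows(313))
print(n_windows)
```

Normalizing D by its maximum attainable value given the allele frequencies keeps D' between 0 and 1, which makes marker pairs with different allele frequencies comparable.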

CONCLUSION

The SNP haplotypic structure of the X chromosome is complex, with regions of high haplotype conservation interspersed among regions of higher haplotype diversity. All the tested programs were accurate (r = 1) in reconstructing the distribution of haplotype frequencies where D* values were high. However, only PHASE achieved a high correlation coefficient (r > 0.7) under conditions of low linkage disequilibrium.
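The accuracy criterion can be sketched as follows: a Pearson correlation between true and inferred haplotype frequencies. Taking the correlation over the union of haplotypes seen in either distribution, with absent haplotypes counted as frequency 0, is an assumption made here, since the abstract does not say how unobserved haplotypes were handled. The function name and the example frequencies are hypothetical.

```python
from math import sqrt


def haplotype_frequency_correlation(true_freq, inferred_freq):
    """Pearson r between two haplotype-frequency dictionaries.

    Keys are haplotype labels (e.g. strings such as "01101");
    values are frequencies. Missing haplotypes count as 0.0.
    """
    keys = sorted(set(true_freq) | set(inferred_freq))
    x = [true_freq.get(k, 0.0) for k in keys]
    y = [inferred_freq.get(k, 0.0) for k in keys]
    n = len(keys)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant distribution")
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical frequencies for one 5-marker window.
true_freq = {"00000": 0.50, "11111": 0.30, "01010": 0.20}
perfect = dict(true_freq)
imperfect = {"00000": 0.40, "11111": 0.30, "01010": 0.10, "00111": 0.20}

r_perfect = haplotype_frequency_correlation(true_freq, perfect)
r_imperfect = haplotype_frequency_correlation(true_freq, imperfect)
print(round(r_perfect, 3), round(r_imperfect, 3))
```

A program that recovers the true distribution exactly gives r = 1, while spurious or missed haplotypes pull r below 1.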


