Predicting sampling saturation of mtDNA haplotypes: an application to an enlarged Portuguese database.

Authors

Pereira Luísa, Cunha Carla, Amorim António

Affiliation

IPATIMUP (Instituto de Patologia e Imunologia Molecular da Universidade do Porto), R. Dr. Roberto Frias s/n, 4200-465 Porto, Portugal.

Publication information

Int J Legal Med. 2004 Jun;118(3):132-6. doi: 10.1007/s00414-003-0424-1. Epub 2004 Feb 11.

Abstract

An enlarged mtDNA database (n=549) for the Portuguese population, comprising the HVRI and HVRII regions, is reported. This database was used to test the effect of sample size on the estimation of relevant parameters such as haplotype diversity, number of different haplotypes, nucleotide diversity and number of polymorphic positions. Simulations were performed by generating sets of random subsamples of variable sizes (n=50, 100, 200, 300 and 400). The results show that while haplotype and nucleotide diversities do not vary significantly with sample size, the numbers of haplotypes and polymorphic positions rise continuously within the tested interval. These trends can be explained by the changing proportions of sequences that are found once or twice, which drop dramatically as sample size increases, with a corresponding rise in the frequency of those encountered 3 times or more. The generated data were also used to extrapolate saturation curves for these parameters. When considering, for instance, the number of haplotypes, it is shown that a sample size of 1,000 individuals is required for practical saturation (defined as the point where a sample size increase of 100 individuals corresponds to an increment in the diversity measure below 5%). For HVRII the same level is reached at n=900, and n=1,300 is needed when both regions are analysed simultaneously. Consequently, we can infer that currently used sample sizes are still rather inadequate for both anthropological and forensic purposes.
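The subsampling procedure and the saturation criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the number of replicates, the use of Nei's unbiased estimator for haplotype diversity, and the reading of the 5% rule as a relative increment are all assumptions made for illustration, and the abstract does not state which curve was fitted for the extrapolation, so the `curve` argument is a placeholder for any fitted function.

```python
import random
from collections import Counter


def haplotype_count(sample):
    """Number of distinct haplotypes in a list of haplotype labels."""
    return len(set(sample))


def haplotype_diversity(sample):
    """Haplotype diversity, n/(n-1) * (1 - sum(p_i^2)).

    Assumption: Nei's unbiased estimator; the abstract does not say
    which estimator was used.
    """
    n = len(sample)
    counts = Counter(sample)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def subsample_means(haplotypes, sizes, replicates=100, seed=0):
    """Mean haplotype count and diversity over random subsamples.

    Subsamples are drawn without replacement. `replicates` and `seed`
    are illustrative defaults, not values taken from the paper.
    """
    rng = random.Random(seed)
    results = {}
    for n in sizes:
        ks, hs = [], []
        for _ in range(replicates):
            sub = rng.sample(haplotypes, n)
            ks.append(haplotype_count(sub))
            hs.append(haplotype_diversity(sub))
        results[n] = (sum(ks) / replicates, sum(hs) / replicates)
    return results


def first_saturated_size(curve, start=100, step=100, threshold=0.05,
                         max_n=10000):
    """Smallest n at which adding `step` individuals raises curve(n)
    by less than `threshold`, measured relative to curve(n).

    `curve` is any callable n -> expected value (hypothetical here).
    Returns None if the criterion is not met up to `max_n`.
    """
    n = start
    while n <= max_n:
        before, after = curve(n), curve(n + step)
        if before > 0 and (after - before) / before < threshold:
            return n
        n += step
    return None
```

With a list holding one haplotype label per individual, `subsample_means(labels, [50, 100, 200, 300, 400])` would give the mean number of haplotypes and mean diversity at each subsample size. A curve fitted to those means could then be passed to `first_saturated_size` to locate the practical saturation point under the assumed reading of the criterion.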

