mtDNA "nomenclutter" and its consequences on the interpretation of genetic data.

Affiliation

Human Biology and Primate Evolution, Freie Universität Berlin, Berlin, Germany.

Publication information

BMC Ecol Evol. 2024 Aug 19;24(1):110. doi: 10.1186/s12862-024-02288-1.

Abstract

Population-based studies of human mitochondrial genetic diversity often require the classification of mitochondrial DNA (mtDNA) haplotypes into more than 5400 described haplogroups, and further grouping those into hierarchically higher haplogroups. Such secondary haplogroup groupings (e.g., "macro-haplogroups") vary across studies, as they depend on the sample quality, technical factors of haplogroup calling, the aims of the study, and the researchers' understanding of the mtDNA haplogroup nomenclature. Retention of historical nomenclature, coupled with a growing number of newly described mtDNA lineages, results in increasingly complex and inconsistent nomenclature that does not reflect phylogeny well. This "clutter" leaves room for grouping errors and inconsistencies across scientific publications, especially when the haplogroup names are used as a proxy for secondary groupings, and is a source of scientific misinterpretation. Here we explore the effects of phylogenetically insensitive secondary mtDNA haplogroup groupings, and of the lack of standardized secondary haplogroup groupings, on downstream analyses and interpretation of genetic data. We demonstrate that frequency-based analyses produce inconsistent results when different secondary mtDNA groupings are applied, and thus allow for vastly different interpretations of the same genetic data. The lack of guidelines and recommendations on how to choose appropriate secondary haplogroup groupings presents an issue for the interpretation of results, as well as their comparison and reproducibility across studies. To reduce biases originating from arbitrarily defined secondary nomenclature-based groupings, we suggest that future updates of mtDNA phylogenies intended for use in mtDNA haplogroup nomenclature should also provide well-defined and standardized sets of phylogenetically meaningful algorithm-based secondary haplogroup groupings such as "macro-haplogroups", "meso-haplogroups", and "micro-haplogroups". Ideally, each of the secondary haplogroup grouping levels should be informative about different human population history events. Those phylogenetically informative levels of haplogroup groupings can be easily defined using TreeCluster, and then implemented into haplogroup callers such as HaploGrep3. This would foster reproducibility across studies, provide a grouping standard for population-based studies, and reduce errors associated with haplogroup nomenclatures in future studies.
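The abstract's point that the same data can yield different frequency profiles under different secondary groupings can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the haplogroup counts are invented toy numbers, `group_by_first_letter` stands in for a name-based grouping, and `PHYLO_PREFIXES` is a hand-written mapping that relies only on the widely described placement of K within U and of H and V within HV. It is not the output of TreeCluster or HaploGrep3.

```python
from collections import Counter

# Hypothetical haplogroup counts for one toy population (NOT real data).
sample_counts = {
    "H1a": 30, "H2a": 10, "V7": 5, "HV0a": 5,
    "U5a1": 20, "U8a": 5, "K1a": 25,
}

def group_by_first_letter(haplogroup: str) -> str:
    """Name-based grouping: use the first letter of the haplogroup name."""
    return haplogroup[0]

# Illustrative phylogeny-aware mapping (an assumption for this sketch):
# K is nested within U, and H and V are nested within HV.
PHYLO_PREFIXES = {"HV": "HV", "H": "HV", "V": "HV", "U": "U", "K": "U"}

def group_by_phylogeny(haplogroup: str) -> str:
    """Assign a group via the longest matching name prefix in PHYLO_PREFIXES."""
    for prefix in sorted(PHYLO_PREFIXES, key=len, reverse=True):
        if haplogroup.startswith(prefix):
            return PHYLO_PREFIXES[prefix]
    return "other"

def frequencies(counts: dict, group_fn) -> dict:
    """Aggregate counts by group and convert them to relative frequencies."""
    totals = Counter()
    for haplogroup, n in counts.items():
        totals[group_fn(haplogroup)] += n
    total = sum(totals.values())
    return {group: n / total for group, n in totals.items()}

by_name = frequencies(sample_counts, group_by_first_letter)
by_phylo = frequencies(sample_counts, group_by_phylogeny)

print("name-based:     ", by_name)
print("phylogeny-aware:", by_phylo)
```

In this toy case the name-based grouping reports K as a separate group making up a quarter of the sample, while the phylogeny-aware grouping folds it into U. The underlying counts are identical, so any difference in a downstream frequency comparison comes from the grouping choice alone.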

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e02b/11331612/bac5d03a96d1/12862_2024_2288_Fig1_HTML.jpg
