A map of canine sequence variation relative to a Greenland wolf outgroup.

Affiliations

Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA.

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.

Publication information

Mamm Genome. 2024 Dec;35(4):565-576. doi: 10.1007/s00335-024-10056-1. Epub 2024 Aug 1.

Abstract

For over 15 years, canine genetics research relied on a reference assembly from a Boxer breed dog named Tasha (i.e., canFam3.1). Recent advances in long-read sequencing and genome assembly have led to the development of numerous high-quality assemblies from diverse canines. These assemblies represent notable improvements in completeness, contiguity, and the representation of gene promoters and gene models. Although genome graph and pan-genome approaches have promise, most genetic analyses in canines rely upon the mapping of Illumina sequencing reads to a single reference. The Dog10K consortium, and others, have generated deep catalogs of genetic variation through an alignment of Illumina sequencing reads to a reference genome obtained from a German Shepherd Dog named Mischka (i.e., canFam4, UU_Cfam_GSD_1.0). However, alignment to a breed-derived genome may introduce bias in genotype calling across samples. Since the use of an outgroup reference genome may remove this effect, we have reprocessed 1929 samples analyzed by the Dog10K consortium using a Greenland wolf (mCanLor1.2) as the reference. We efficiently performed remapping and variant calling using a GPU implementation of common analysis tools. The resulting call set removes the variability in genetic differences seen across samples, and breed relationships revealed by principal component analysis are not affected by the choice of reference genome. Using this sequence data, we inferred the history of population sizes and found that village dog populations experienced a 9- to 13-fold reduction in historic effective population size relative to wolves.
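
The reference-bias concern in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the genotype values are invented toy data, and the helper names `nonref_counts` and `coefficient_of_variation` are hypothetical. The idea is that a sample from the same breed as the reference shows fewer non-reference sites than other samples, whereas an outgroup reference would be expected to sit at a more even distance from all dogs.

```python
from statistics import mean, pstdev


def nonref_counts(genotypes):
    """Count, per sample, the sites carrying at least one non-reference allele.

    `genotypes` maps a sample name to a list of alternate-allele counts
    (0, 1, or 2) per site; None marks a missing call and is skipped.
    """
    return {
        sample: sum(1 for g in calls if g is not None and g > 0)
        for sample, calls in genotypes.items()
    }


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    values = list(values)
    m = mean(values)
    return pstdev(values) / m if m else 0.0


# Hypothetical toy calls against a breed-derived reference: the sample from
# the reference's own breed ("gsd_1") looks artificially close to it.
breed_ref_calls = {
    "gsd_1": [0, 0, 0, 1, 0, 0],
    "boxer_1": [1, 2, 0, 1, 1, 0],
    "village_1": [2, 1, 1, 1, 2, None],
}

# Hypothetical toy calls for the same samples against an outgroup reference.
outgroup_ref_calls = {
    "gsd_1": [1, 2, 1, 0, 2, 1],
    "boxer_1": [2, 1, 1, 1, 0, 2],
    "village_1": [1, 1, 2, 1, 2, None],
}

breed_counts = nonref_counts(breed_ref_calls)
outgroup_counts = nonref_counts(outgroup_ref_calls)

cv_breed = coefficient_of_variation(breed_counts.values())
cv_outgroup = coefficient_of_variation(outgroup_counts.values())

print("breed reference:", breed_counts, "CV =", round(cv_breed, 2))
print("outgroup reference:", outgroup_counts, "CV =", round(cv_outgroup, 2))
```

In this toy example the spread of per-sample counts is larger under the breed-derived reference than under the outgroup reference, which is the kind of variability the abstract reports being removed. Real call sets would be read from VCF files and would need filtering for depth and missingness before such a comparison.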
