Genomes of the Mouse Collaborative Cross.

Authors

Srivastava Anuj, Morgan Andrew P, Najarian Maya L, Sarsani Vishal Kumar, Sigmon J Sebastian, Shorter John R, Kashfeen Anwica, McMullan Rachel C, Williams Lucy H, Giusti-Rodríguez Paola, Ferris Martin T, Sullivan Patrick, Hock Pablo, Miller Darla R, Bell Timothy A, McMillan Leonard, Churchill Gary A, Pardo-Manuel de Villena Fernando

Affiliations

The Jackson Laboratory, Bar Harbor, Maine 04609.

Department of Genetics, University of North Carolina, Chapel Hill, North Carolina 27599.

Publication information

Genetics. 2017 Jun;206(2):537-556. doi: 10.1534/genetics.116.198838.

Abstract

The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30× coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. 
The sequencing resource provides the first description of tens of thousands of new genetic variants introduced by mutation and drift in the CC genomes. We estimate that new SNP mutations are accumulating in each CC strain at a rate of 2.4 ± 0.4 per gigabase per generation. The fixation of new mutations by genetic drift has introduced thousands of new variants into the CC strains. The majority of these mutations are novel compared to currently sequenced laboratory stocks and wild mice, and some are predicted to alter gene function. Approximately one-third of the CC inbred strains have acquired large deletions (>10 kb) many of which overlap known coding genes and functional elements. The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources.
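To give a sense of scale for the reported mutation rate, the following is a minimal sketch that converts a per-gigabase, per-generation rate into an expected count of new SNPs. Only the rate (2.4 ± 0.4) comes from the abstract. The genome size of 2.7 Gb and the 20 generations are illustrative assumptions, and the abstract does not say whether the rate refers to a haploid or diploid genome, so the genome size is left as a free parameter.

```python
def expected_new_snps(rate_per_gb_per_gen, genome_size_gb, generations):
    """Expected number of new SNP mutations, assuming a constant rate
    that scales linearly with genome size and number of generations."""
    return rate_per_gb_per_gen * genome_size_gb * generations


# Reported in the abstract: 2.4 +/- 0.4 new SNPs per gigabase per generation.
RATE = 2.4
RATE_ERR = 0.4

# Assumptions for illustration only (not taken from the abstract).
GENOME_SIZE_GB = 2.7   # hypothetical round figure for a mouse genome
GENERATIONS = 20       # hypothetical number of inbreeding generations

mean = expected_new_snps(RATE, GENOME_SIZE_GB, GENERATIONS)
low = expected_new_snps(RATE - RATE_ERR, GENOME_SIZE_GB, GENERATIONS)
high = expected_new_snps(RATE + RATE_ERR, GENOME_SIZE_GB, GENERATIONS)

print(f"Expected new SNPs after {GENERATIONS} generations: "
      f"{mean:.1f} (range {low:.1f} to {high:.1f})")
```

Because the relationship is linear, the uncertainty in the rate carries over proportionally to the expected count. The count of mutations that actually become fixed in a strain also depends on drift, which this sketch does not model.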

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a17/5499171/ca5a33161f01/537fig1.jpg
