Grenier Jennifer K, Arguello J Roman, Moreira Margarida Cardoso, Gottipati Srikanth, Mohammed Jaaved, Hackett Sean R, Boughton Rachel, Greenberg Anthony J, Clark Andrew G
Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853.
Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853; Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland CH-1015.
G3 (Bethesda). 2015 Feb 11;5(4):593-603. doi: 10.1534/g3.114.015883.
Reference collections of multiple Drosophila lines with accumulating collections of "omics" data have proven especially valuable for the study of population genetics and complex trait genetics. Here we describe a resource collection of 84 strains of Drosophila melanogaster whose genome sequences were obtained after 12 generations of full-sib inbreeding. The initial rationale for this resource was to foster development of a systems biology platform for modeling metabolic regulation, using natural polymorphisms as perturbations. As reference lines, the strains are amenable to repeated phenotypic measurements, and a large collection of metabolic traits has already been assayed. Another key feature of these strains is their widespread geographic origin: Beijing, Ithaca, the Netherlands, Tasmania, and Zimbabwe. After obtaining 12.5× coverage of paired-end Illumina sequence reads, SNP and indel calls were made with the GATK platform. Thorough quality control was enabled by deep sequencing one line to >100×, and single-nucleotide polymorphisms and indels were validated using ddRAD-sequencing as an orthogonal platform. In addition, a series of preliminary population genetic tests was performed on these single-nucleotide polymorphism data to assess data quality. We found 83 segregating inversions among the lines, and as expected these were especially abundant in the African sample. We anticipate that this collection will be a useful addition to the set of reference D. melanogaster strains, thanks to its geographic structuring and unusually high level of genetic diversity.
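As a rough illustration of what 12 generations of full-sib inbreeding implies, the sketch below applies the standard textbook recurrence for the expected inbreeding coefficient under continued full-sib mating, F(t) = (1 + 2F(t-1) + F(t-2)) / 4. It is not taken from the paper: the function name `full_sib_inbreeding` is hypothetical, and the calculation assumes unrelated, non-inbred founders and no selection against homozygotes, so realized homozygosity in actual lines may well differ.

```python
def full_sib_inbreeding(generations):
    """Expected inbreeding coefficient F after each generation of full-sib mating.

    Assumptions (not from the source): founders are unrelated and non-inbred,
    and there is no selection favoring heterozygotes.
    """
    f_prev2 = 0.0  # F two generations back (founders)
    f_prev1 = 0.0  # F one generation back (first set of siblings)
    values = []
    for _ in range(generations):
        # Standard recurrence for continued full-sib mating.
        f = 0.25 * (1.0 + 2.0 * f_prev1 + f_prev2)
        values.append(f)
        f_prev2, f_prev1 = f_prev1, f
    return values


if __name__ == "__main__":
    for gen, f in enumerate(full_sib_inbreeding(12), start=1):
        print(f"generation {gen:2d}: F = {f:.4f}")
```

Under these idealized assumptions, the expected coefficient after 12 generations is a little under 0.93, which is in line with the need for per-line quality control on residual heterozygosity.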