Stothard Paul, Liao Xiaoping, Arantes Adriano S, De Pauw Mary, Coros Colin, Plastow Graham S, Sargolzaei Mehdi, Crowley John J, Basarab John A, Schenkel Flavio, Moore Stephen, Miller Stephen P
Department of Agricultural, Food and Nutritional Science / Livestock Gentec, University of Alberta, Edmonton, AB, Canada.
Department of Agricultural, Food and Nutritional Science / Livestock Gentec, University of Alberta, Edmonton, AB, Canada; Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China.
Gigascience. 2015 Oct 26;4:49. doi: 10.1186/s13742-015-0090-5. eCollection 2015.
The Canadian Cattle Genome Project is a large-scale international project that aims to develop genomics-based tools to enhance the efficiency and sustainability of beef and dairy production. Obtaining DNA sequence information is an important part of achieving this goal as it facilitates efforts to associate specific DNA differences with phenotypic variation. These associations can be used to guide breeding decisions and provide valuable insight into the molecular basis of traits.
We describe a dataset of 379 whole-genome sequences, taken primarily from key historic Bos taurus animals, along with the analyses that were performed to assess data quality. The sequenced animals represent ten populations relevant to beef or dairy production. Animal information (name, breed, population), sequence data metrics (mapping rate, depth, concordance), and sequence repository identifiers (NCBI BioProject and BioSample IDs) are provided to enable others to access and exploit this sequence information.
The large number of whole-genome sequences generated as a result of this project will contribute to ongoing work aiming to catalogue the variation that exists in cattle as well as efforts to improve traits through genotype-guided selection. Studies of gene function, population structure, and sequence evolution are also likely to benefit from the availability of this resource.