ExAgBov: A public database of annotated variations from hundreds of bovine whole-exome sequencing samples.

Author affiliations

Department of Ruminant Science, Institute of Animal Sciences, Agricultural Research Organization, The Volcani Center, Rishon LeZion, 7505101, Israel.

Department of Animal Sciences, Robert H. Smith Faculty of Agriculture, Food and Environment, the Hebrew University, Rehovot, 76100, Israel.

Publication information

Sci Data. 2022 Aug 2;9(1):469. doi: 10.1038/s41597-022-01597-8.

Abstract

Large reference datasets of annotated genetic variation from genome-scale sequencing are essential for interpreting identified variants, their functional impact, and their possible contribution to diseases and traits. To date, however, no such database of annotated variation from broad cattle populations is publicly available. To close this gap and advance bovine NGS-driven variant discovery and interpretation, we obtained and analyzed raw data deposited in the public SRA repository. Short reads from 262 whole-exome sequencing samples of Bos taurus were mapped to the Bos taurus ARS-UCD1.2 reference genome. Variants were called with the GATK Best Practices workflow, and all recorded variants were comprehensively annotated with the Ensembl Variant Effect Predictor (VEP). An in-depth analysis of population structure revealed the breeds comprising the database. The Exomes Aggregate of Bovine (ExAgBov) is a comprehensively annotated dataset of more than 20 million short variants, of which ~2% are located within open reading frames, splice regions, and UTRs, and more than 60,000 are predicted to be deleterious.
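To make the "~2% within open reading frames, splice regions, and UTRs" figure concrete, the following minimal sketch shows one way such a share could be tallied from a VEP-annotated VCF. It relies only on VEP's documented VCF convention of a pipe-delimited `CSQ` INFO field whose sub-field order is declared in the header. The `TARGET_TERMS` grouping, the function names, and the four-record `TOY_VCF` are assumptions made up for illustration; they are not the authors' term list, code, or data.

```python
# Sequence Ontology terms counted as "ORF / splice region / UTR" in this sketch.
# ASSUMPTION: this grouping is illustrative; the abstract does not list the exact terms.
TARGET_TERMS = {
    "missense_variant", "synonymous_variant", "stop_gained", "frameshift_variant",
    "splice_region_variant", "splice_donor_variant", "splice_acceptor_variant",
    "5_prime_UTR_variant", "3_prime_UTR_variant",
}


def csq_fields(header_lines):
    """Return the CSQ sub-field names declared in the VCF header."""
    for line in header_lines:
        if line.startswith("##INFO=<ID=CSQ"):
            fmt = line.strip().split("Format: ", 1)[1]
            return fmt.rstrip('">').split("|")
    raise ValueError("no CSQ header line found")


def count_target_variants(vcf_lines):
    """Return (hits, total): records with any target consequence, and all records."""
    lines = [line.rstrip("\n") for line in vcf_lines]
    fields = csq_fields(line for line in lines if line.startswith("##"))
    cons_idx = fields.index("Consequence")
    hits = total = 0
    for line in lines:
        if not line or line.startswith("#"):
            continue  # skip header and blank lines
        info = line.split("\t")[7]  # INFO is the 8th VCF column
        csq = next((kv[4:] for kv in info.split(";") if kv.startswith("CSQ=")), "")
        terms = set()
        for entry in csq.split(","):  # one entry per transcript/allele
            parts = entry.split("|")
            if len(parts) > cons_idx:
                terms.update(parts[cons_idx].split("&"))  # '&' joins multiple terms
        total += 1
        hits += bool(terms & TARGET_TERMS)
    return hits, total


# Hypothetical toy records, NOT taken from ExAgBov.
TOY_VCF = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
    'from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t.\tPASS\tCSQ=G|missense_variant|MODERATE|GENE1",
    "1\t200\t.\tC\tT\t.\tPASS\tCSQ=T|intron_variant|MODIFIER|GENE1",
    "2\t300\t.\tG\tA\t.\tPASS\tCSQ=A|splice_region_variant&intron_variant|LOW|GENE2,"
    "A|intergenic_variant|MODIFIER|",
    "2\t400\t.\tT\tC\t.\tPASS\tCSQ=C|intergenic_variant|MODIFIER|",
]

if __name__ == "__main__":
    hits, total = count_target_variants(TOY_VCF)
    print(f"{hits} of {total} toy variants fall in the target categories")
```

A record is counted once if any of its transcript-level annotations carries a target term, which is one plausible reading of "located within"; a per-transcript or most-severe-consequence tally would give different numbers.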

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ffe9/9345876/457db6880f5a/41597_2022_1597_Fig1_HTML.jpg
