Genetic Analysis Workshop 18 single-nucleotide variant prioritization based on protein impact, sequence conservation, and gene annotation.

Authors

Nalpathamkalam Thomas, Derkach Andriy, Paterson Andrew D, Merico Daniele

Affiliations

The Centre for Applied Genomics, The Hospital for Sick Children, 101 College Street, M5G 1L7 Toronto, ON, Canada; Program in Genetics and Genome Biology, The Hospital for Sick Children, 101 College Street, M5G 1L7 Toronto, ON, Canada.

Department of Statistics, University of Toronto, 100 St. George St., M5S 3G3 Toronto, ON, Canada.

Publication information

BMC Proc. 2014 Jun 17;8(Suppl 1 Genetic Analysis Workshop 18):S11. doi: 10.1186/1753-6561-8-S1-S11. eCollection 2014.

Abstract

Grouping variants based on gene mapping can augment the power of rare variant association tests. Weighting or sorting variants based on their expected functional impact can provide additional benefit. We defined groups of prioritized variants based on systematic annotation of Genetic Analysis Workshop 18 (GAW18) single-nucleotide variants; we focused on variants detected by whole genome sequencing, specifically on the high-quality subset presented in the genotype files. First, we divided variants into coding and noncoding. Coding variants are fewer than 1% of the total and are more likely to have a biological effect than noncoding variants. Coding variants were further stratified into protein changing and protein damaging groups based on their effect on the protein amino acid sequence. In particular, missense variants predicted to be damaging, splice-site alterations, and stop gains were assigned to the protein damaging category. The impact of noncoding variants is more difficult to predict. We decided to rely solely on conservation: we combined (a) the mammalian phastCons Conserved Element and (b) the PhyloP score, which identify conserved intervals and conserved single-nucleotide positions, respectively. This reduced the noncoding variants to a number comparable to that of the coding variants. Finally, using gene structure definitions from the widely used RefSeq database, we mapped variants to genes to support association tests that require collapsing rare variants to genes. Companion GAW18 papers used these variant priority groups and gene mapping; one of these papers specifically found evidence of a stronger association signal for protein damaging variants.
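The prioritization described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Variant` fields, the group labels, the effect names, and the `PHYLOP_CUTOFF` value are all hypothetical, since the abstract does not give the annotation fields or thresholds used. The sketch also assumes that a noncoding variant must both fall in a phastCons conserved element and pass the PhyloP cutoff, that splice-site variants arrive already flagged as coding, and that each variant is reported only in its most stringent group.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff: the abstract does not state the PhyloP threshold.
PHYLOP_CUTOFF = 2.0

# Effects treated as damaging regardless of any missense predictor.
DAMAGING_EFFECTS = {"stop_gain", "splice_site"}
# Effects that alter the protein sequence (assumed superset of the above).
CHANGING_EFFECTS = {"missense"} | DAMAGING_EFFECTS


@dataclass(frozen=True)
class Variant:
    """Illustrative annotation record; field names are assumptions."""
    variant_id: str
    gene: Optional[str]            # gene symbol from gene mapping, if any
    is_coding: bool
    effect: Optional[str] = None   # e.g. "missense", "stop_gain"
    predicted_damaging: bool = False
    in_phastcons_element: bool = False
    phylop: float = 0.0


def priority_group(v: Variant) -> str:
    """Return the most stringent priority group the variant falls into."""
    if v.is_coding:
        damaging_missense = v.effect == "missense" and v.predicted_damaging
        if v.effect in DAMAGING_EFFECTS or damaging_missense:
            return "protein_damaging"
        if v.effect in CHANGING_EFFECTS:
            return "protein_changing"
        return "coding_other"
    # Noncoding: rely on conservation only (interval AND position).
    if v.in_phastcons_element and v.phylop >= PHYLOP_CUTOFF:
        return "noncoding_conserved"
    return "noncoding_other"


def collapse_by_gene(variants, groups):
    """Map each gene to the IDs of its variants in the selected groups."""
    by_gene = defaultdict(list)
    for v in variants:
        if v.gene is not None and priority_group(v) in groups:
            by_gene[v.gene].append(v.variant_id)
    return dict(by_gene)
```

A gene-based rare variant test would then take, for example, `collapse_by_gene(variants, {"protein_damaging"})` as its per-gene variant sets, or a wider set of groups for a less stringent analysis.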
