The DBSAV Database: Predicting Deleteriousness of Single Amino Acid Variations in the Human Proteome.

Affiliations

Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.

Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Departments of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.

Publication information

J Mol Biol. 2021 May 28;433(11):166915. doi: 10.1016/j.jmb.2021.166915. Epub 2021 Mar 4.

Abstract

Deleterious single amino acid variation (SAV) is one of the leading causes of human diseases. Evaluating the functional impact of SAVs is crucial for the diagnosis of genetic disorders. We previously developed a deep convolutional neural network predictor, DeepSAV, to evaluate the deleterious effects of SAVs on protein function based on various sequence, structural, and functional properties. DeepSAV scores of rare SAVs observed in the human population are aggregated into a gene-level score called GTS (Gene Tolerance of rare SAVs) that reflects a gene's tolerance to deleterious missense mutations and serves as a useful tool to study gene-disease associations. In this study, we aim to enhance the performance of DeepSAV by using expanded datasets of pathogenic and benign variants, more features, and neural network optimization. We found that multiple sequence alignments built from vertebrate-level orthologs yield better prediction results than those built from mammalian-level orthologs. For multiple sequence alignments built from BLAST searches, optimal performance was achieved with a sequence identity cutoff of 50% to remove distant homologs. The new version of DeepSAV exhibits the best performance among standalone predictors of deleterious effects of SAVs. We developed the DBSAV database (http://prodata.swmed.edu/DBSAV) that reports GTS scores of human genes and DeepSAV scores of SAVs in the human proteome, including pathogenic and benign SAVs, population-level SAVs, and all possible SAVs arising from single nucleotide variations. This database serves as a useful resource for research on human SAVs and their relationships with protein functions and human diseases.
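To make the 50% sequence identity cutoff concrete, here is a minimal sketch of how distant homologs might be dropped from an alignment. It is an illustration, not the authors' pipeline: the abstract does not say how identity was computed, so the choices below (identity measured against the query, over columns where neither sequence has a gap, with the cutoff treated as inclusive) are assumptions, and the function names and example sequences are hypothetical.

```python
from typing import Dict

GAP = "-"


def percent_identity(query: str, hit: str) -> float:
    """Percent identical residues over columns where both sequences are ungapped.

    Assumption: both strings are rows of the same alignment (equal length).
    """
    if len(query) != len(hit):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither the query nor the hit has a gap.
    columns = [(q, h) for q, h in zip(query, hit) if q != GAP and h != GAP]
    if not columns:
        return 0.0
    matches = sum(1 for q, h in columns if q == h)
    return 100.0 * matches / len(columns)


def filter_by_identity(
    query: str, hits: Dict[str, str], cutoff: float = 50.0
) -> Dict[str, str]:
    """Keep hits whose identity to the query is at least `cutoff` percent."""
    return {
        name: seq
        for name, seq in hits.items()
        if percent_identity(query, seq) >= cutoff
    }


# Hypothetical toy alignment: one query row and three aligned hits.
example_query = "MKTAYIAKQR"
example_hits = {
    "close": "MKTAYIAKQK",    # one mismatch in ten columns
    "distant": "MATGHLSRQR",  # six mismatches in ten columns
    "gapped": "MKT--IAKQR",   # two gap columns, no mismatches
}
kept = filter_by_identity(example_query, example_hits, cutoff=50.0)
```

With these toy sequences, `kept` retains the "close" and "gapped" hits and drops "distant", whose identity to the query falls below the cutoff. A real workflow would read the alignment from BLAST output rather than from hard-coded strings.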
