Missense3D-DB web catalogue: an atom-based analysis and repository of 4M human protein-coding genetic variants.

Affiliation

Department of Life Sciences, Centre for Integrative System Biology and Bioinformatics, Imperial College London, London, SW7 2AZ, UK.

Publication information

Hum Genet. 2021 May;140(5):805-812. doi: 10.1007/s00439-020-02246-z. Epub 2021 Jan 27.

Abstract

The interpretation of human genetic variation is one of the greatest challenges of modern genetics. New approaches are urgently needed to prioritize variants, especially those that are rare or lack a definitive clinical interpretation. We examined 10,136,597 human missense genetic variants from GnomAD, ClinVar and UniProt. We performed large-scale atom-based mapping and phenotype interpretation of 3,960,015 of these variants onto 18,874 experimental and 84,818 in-house predicted three-dimensional coordinates of the human proteome. We demonstrate that 14% of amino acid substitutions from the GnomAD database that could be structurally analysed are predicted to affect protein structure (n = 568,548, of which 566,439 are rare or extremely rare) and may, therefore, have an as yet unknown disease-causing effect. The same is true for 19.0% (n = 6,266) of variants of unknown clinical significance or conflicting interpretation reported in the ClinVar database. The results of the structural analysis are available in the dedicated web catalogue Missense3D-DB (http://missense3d.bc.ic.ac.uk/). For each of the 4M variants, the results of the structural analysis are presented in a user-friendly, concise format that can be included in clinical genetic reports. A detailed report of the structural analysis is also available for non-experts in structural biology. Population frequency and predictions from SIFT and PolyPhen are included for a more comprehensive variant interpretation. This is the first large-scale atom-based structural interpretation of human genetic variation and offers geneticists and the biomedical community a new approach to genetic variant interpretation.
