Deep learning extends de novo protein modelling coverage of genomes using iteratively predicted structural constraints.

Affiliations

Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK.

The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK.

Publication information

Nat Commun. 2019 Sep 4;10(1):3977. doi: 10.1038/s41467-019-11994-0.

Abstract

The inapplicability of amino acid covariation methods to small protein families has limited their use for structural annotation of whole genomes. Recently, deep learning has shown promise in allowing accurate residue-residue contact prediction even for shallow sequence alignments. Here we introduce DMPfold, which uses deep learning to predict inter-atomic distance bounds, the main chain hydrogen bond network, and torsion angles, which it uses to build models in an iterative fashion. DMPfold produces more accurate models than two popular methods for a test set of CASP12 domains, and works just as well for transmembrane proteins. Applied to all Pfam domains without known structures, confident models for 25% of these so-called dark families were produced in under a week on a small 200 core cluster. DMPfold provides models for 16% of human proteome UniProt entries without structures, generates accurate models with fewer than 100 sequences in some cases, and is freely available.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f0ad/6726615/0b452c5799b5/41467_2019_11994_Fig1_HTML.jpg
