Uncovering new families and folds in the natural protein universe.

Affiliations

Biozentrum, University of Basel, Basel, Switzerland.

SIB Swiss Institute of Bioinformatics, University of Basel, Basel, Switzerland.

Publication information

Nature. 2023 Oct;622(7983):646-653. doi: 10.1038/s41586-023-06622-3. Epub 2023 Sep 13.


DOI: 10.1038/s41586-023-06622-3
PMID: 37704037
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10584680/

Abstract

We are now entering a new era in protein sequence and structure annotation, with hundreds of millions of predicted protein structures made available through the AlphaFold database. These models cover nearly all proteins that are known, including those challenging to annotate for function or putative biological role using standard homology-based approaches. In this study, we examine the extent to which the AlphaFold database has structurally illuminated this 'dark matter' of the natural protein universe at high predicted accuracy. We further describe the protein diversity that these models cover as an annotated interactive sequence similarity network, accessible at https://uniprot3d.org/atlas/AFDB90v4. By searching for novelties from sequence, structure and semantic perspectives, we uncovered the β-flower fold, added several protein families to Pfam database and experimentally demonstrated that one of these belongs to a new superfamily of translation-targeting toxin-antitoxin systems, TumE-TumA. This work underscores the value of large-scale efforts in identifying, annotating and prioritizing new protein families. By leveraging the recent deep learning revolution in protein bioinformatics, we can now shed light into uncharted areas of the protein universe at an unprecedented scale, paving the way to innovations in life sciences and biotechnology.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/909d/10584680/d8452398ef69/41586_2023_6622_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/909d/10584680/38c85e8946e6/41586_2023_6622_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/909d/10584680/09f0e0ec1ce9/41586_2023_6622_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/909d/10584680/a3350aefbe36/41586_2023_6622_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/909d/10584680/4aefad19f072/41586_2023_6622_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/909d/10584680/d23e3979f574/41586_2023_6622_Fig6_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/909d/10584680/fc939090bdf1/41586_2023_6622_Fig7_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/909d/10584680/65d16ef3b15c/41586_2023_6622_Fig8_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/909d/10584680/386f40ea9822/41586_2023_6622_Fig9_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/909d/10584680/163529d49c42/41586_2023_6622_Fig10_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/909d/10584680/10e0da52a03f/41586_2023_6622_Fig11_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/909d/10584680/fe6beb12ddbd/41586_2023_6622_Fig12_ESM.jpg
