Fast protein structure searching using structure graph embeddings.

Author information

Greener Joe G, Jamali Kiarash

Affiliation

Medical Research Council Laboratory of Molecular Biology, Cambridge, CB2 0QH, United Kingdom.

Publication information

Bioinform Adv. 2025 Mar 5;5(1):vbaf042. doi: 10.1093/bioadv/vbaf042. eCollection 2025.

Abstract

UNLABELLED

Comparing and searching protein structures independent of primary sequence has proved useful for remote homology detection, function annotation, and protein classification. Fast and accurate methods to search with structures will be essential to make use of the vast databases that have recently become available, in the same way that fast protein sequence searching underpins much of bioinformatics. We train a simple graph neural network using supervised contrastive learning to learn a low-dimensional embedding of protein domains.
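To make "supervised contrastive learning" concrete, here is a minimal sketch of one common form of the loss, in which embeddings that share a class label are pulled together and all others are pushed apart. The abstract does not give the exact formulation, so the function name `supcon_loss`, the temperature value, and the toy two-dimensional embeddings are illustrative assumptions, not the authors' implementation.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss, averaged over anchors with >= 1 positive.

    Assumption: a generic formulation with cosine similarity and a
    temperature; the paper's actual loss may differ in detail.
    """
    z = [normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives are the other samples that share the anchor's label.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two tight clusters in 2D (hypothetical data).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
matched = supcon_loss(toy_embeddings, ["a", "a", "b", "b"])
mismatched = supcon_loss(toy_embeddings, ["a", "b", "a", "b"])
```

When the labels agree with the geometric clusters (`matched`), the loss is close to zero; when they cut across the clusters (`mismatched`), it is much larger, which is the signal that drives same-class domains towards nearby embeddings.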

AVAILABILITY AND IMPLEMENTATION

The method, called Progres, is available as software at https://github.com/greener-group/progres and as a web server at https://progres.mrc-lmb.cam.ac.uk. Its accuracy is comparable to that of the best current methods, and it can search the TED domains of the AlphaFold database in a tenth of a second per query on CPU.
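The speed of embedding-based search comes from reducing each structure comparison to a comparison of short vectors. The sketch below ranks precomputed embeddings by cosine similarity to a query embedding. It is not the Progres API: the `search` function, the domain identifiers, the three-dimensional vectors, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def search(query, database, top_k=3):
    """Return up to top_k (domain_id, cosine_similarity) pairs, best first.

    `database` maps a hypothetical domain identifier to its embedding.
    """
    q = normalize(query)
    scored = []
    for domain_id, embedding in database.items():
        e = normalize(embedding)
        scored.append((domain_id, sum(a * b for a, b in zip(q, e))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical precomputed embeddings for three domains.
database = {
    "domain_A": [0.9, 0.1, 0.0],
    "domain_B": [0.0, 1.0, 0.0],
    "domain_C": [0.7, 0.7, 0.1],
}
hits = search([1.0, 0.0, 0.0], database, top_k=2)
for domain_id, score in hits:
    print(f"{domain_id}\t{score:.3f}")
```

A linear scan like this is enough to show the idea; at the scale of a large structure database, the same ranking would normally be computed with vectorised matrix operations or a nearest-neighbour index rather than a Python loop.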

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c037/11974391/f6b940bbe44a/vbaf042f1.jpg
