RUPEE: A fast and accurate purely geometric protein structure search.

Affiliation

School of Computing and Engineering, University of Missouri at Kansas City, Kansas City, United States of America.

Publication information

PLoS One. 2019 Mar 15;14(3):e0213712. doi: 10.1371/journal.pone.0213712. eCollection 2019.

DOI:10.1371/journal.pone.0213712
PMID:30875409
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6420038/
Abstract

Given the close relationship between protein structure and function, protein structure searches have long played an established role in bioinformatics. Despite their maturity, existing protein structure searches either use simplifying assumptions or compromise between fast response times and quality of results. These limitations can prevent the easy and efficient exploration of relationships between protein structures, which is the norm in other areas of inquiry. To address these limitations we have developed RUPEE, a fast and accurate purely geometric structure search combining techniques from information retrieval and big data with a novel approach to encoding sequences of torsion angles. Comparing our results to the output of mTM, SSM, and the CATHEDRAL structural scan, it is clear that RUPEE has set a new bar for purely geometric big data approaches to protein structure searches. RUPEE in top-aligned mode produces equal or better results than the best available protein structure searches, and RUPEE in fast mode demonstrates the fastest response times coupled with high quality results. The RUPEE protein structure search is available at https://ayoubresearch.com. Code and data are available at https://github.com/rayoub/rupee.
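The abstract mentions encoding sequences of torsion angles and borrowing techniques from information retrieval, without giving details. As a rough illustration only, the sketch below computes a torsion (dihedral) angle from four 3D points, bins a sequence of angles into coarse integer codes, and compares two code sequences by Jaccard similarity over overlapping k-length windows. The 30-degree bin width, the window length, and the function names `encode`, `shingles`, and `jaccard` are assumptions made for this example; they are not taken from the paper and should not be read as RUPEE's actual encoding or search method.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees for four 3D points (0 = cis, 180 = trans)."""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)   # normals of the two planes
    y = math.sqrt(dot(b1, b1)) * dot(b0, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def encode(angles, bin_width=30):
    """Map angles in [-180, 180] to integer bin codes (hypothetical scheme)."""
    n_bins = 360 // bin_width
    return [int((a + 180) // bin_width) % n_bins for a in angles]


def shingles(codes, k=3):
    """Set of overlapping length-k windows over a code sequence."""
    return {tuple(codes[i:i + k]) for i in range(len(codes) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
    print(round(angle, 1))
    s1 = shingles(encode([-60, -45, -60, -45, 120, 130]))
    s2 = shingles(encode([-60, -45, -60, -45, -60, 130]))
    print(jaccard(s1, s2))
```

Set-based comparisons of this kind are attractive for large databases because they avoid a full structural alignment for every candidate; whether and how RUPEE uses them is described in the full text, not in the abstract.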
