3DFI: a pipeline to infer protein function using structural homology.

Authors

Julian Alexander Thomas, Dos Santos Anne Caroline Mascarenhas, Pombert Jean-François

Affiliation

Department of Biology, Illinois Institute of Technology, Chicago, IL, USA.

Publication information

Bioinform Adv. 2021;1(1). doi: 10.1093/bioadv/vbab030. Epub 2021 Nov 10.

DOI:10.1093/bioadv/vbab030
PMID:35664289
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9162058/
Abstract

MOTIVATION

Inferring protein function is an integral part of genome annotation and analysis. This process is usually performed in silico, and most inferences are based on sequence homology approaches, which can fail when in presence of divergent sequences. However, because protein structures and their biological roles are intertwined, protein function can also be inferred by searching for structural homology. Many excellent tools have been released in recent years with regards to protein structure prediction, structural homology searches and protein visualization. Unfortunately, these tools are disconnected from each other and often use a web server-based approach that is ill-suited to high-throughput genome-wide analyses. To help assist genome annotation, we built a structural homology-based pipeline called 3DFI (for tridimensional functional inference) leveraging some of the best structural homology tools. This pipeline was built with simplicity of use in mind and enables genome-wide structural homology inferences.

AVAILABILITY AND IMPLEMENTATION

3DFI is available on GitHub https://github.com/PombertLab/3DFI under the permissive MIT license. The pipeline is written in Perl and Python.
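
The general idea described in the abstract, transferring a functional annotation from the best structural match, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `StructuralHit` record, the 0-to-1 similarity score, the `min_score` cutoff of 0.5, and all identifiers and annotations in the example data are hypothetical, and none of it reflects 3DFI's actual scripts, scoring, or file formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StructuralHit:
    """One structural match between a predicted structure and a known one.

    All fields are hypothetical and chosen for illustration only.
    """
    query: str        # identifier of the protein being annotated
    target: str       # identifier of the matched reference structure
    score: float      # similarity score in [0, 1]; higher means more similar (assumed)
    annotation: str   # functional description attached to the reference structure


def infer_functions(hits, min_score=0.5):
    """Return {query: (annotation, target, score)} from the best hit per query.

    Queries whose best hit scores below min_score are left unannotated.
    """
    best = {}
    for hit in hits:
        current = best.get(hit.query)
        # Keep only the highest-scoring hit seen so far for each query.
        if current is None or hit.score > current.score:
            best[hit.query] = hit
    return {
        query: (hit.annotation, hit.target, hit.score)
        for query, hit in best.items()
        if hit.score >= min_score
    }


# Made-up example data: two hits for PROT_001, one weak hit for PROT_002.
example_hits = [
    StructuralHit("PROT_001", "TARGET_A", 0.82, "ATP-binding cassette transporter"),
    StructuralHit("PROT_001", "TARGET_B", 0.41, "sugar kinase"),
    StructuralHit("PROT_002", "TARGET_C", 0.30, "zinc finger protein"),
]

inferred = infer_functions(example_hits)
for query, (annotation, target, score) in inferred.items():
    print(f"{query}\t{annotation}\t{target}\t{score:.2f}")
```

In this sketch PROT_001 inherits the annotation of its strongest match, while PROT_002 stays unannotated because its only hit falls below the cutoff. A real analysis would take the score and threshold from whichever structural alignment tool is used.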

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ee0/9710623/74a457cdec1f/vbab030f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ee0/9710623/333469870e35/vbab030f2.jpg
