Julian Alexander Thomas, Dos Santos Anne Caroline Mascarenhas, Pombert Jean-François
Department of Biology, Illinois Institute of Technology, Chicago, IL, USA.
Bioinform Adv. 2021;1(1). doi: 10.1093/bioadv/vbab030. Epub 2021 Nov 10.
Inferring protein function is an integral part of genome annotation and analysis. This process usually relies on sequence homology approaches, which can fail in the presence of divergent sequences. However, because protein structures and their biological roles are intertwined, protein function can also be inferred by searching for structural homology. Many excellent tools for protein structure prediction, structural homology searches and protein visualization have been released in recent years. Unfortunately, these tools are disconnected from each other and often use a web server-based approach that is ill-suited to high-throughput genome-wide analyses. To assist genome annotation, we built a structural homology-based pipeline called 3DFI (for tridimensional functional inference) that leverages some of the best structural homology tools. This pipeline was built with simplicity of use in mind and enables genome-wide structural homology inferences.
3DFI is available on GitHub at https://github.com/PombertLab/3DFI under the permissive MIT license. The pipeline is written in Perl and Python.