Revealing aperiodic aspects of solenoid proteins from sequence information.

Authors

Hrabe Thomas, Jaroszewski Lukasz, Godzik Adam

Affiliation

Department of Bioinformatics and Systems Biology, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA 92037, USA.

Publication information

Bioinformatics. 2016 Sep 15;32(18):2776-82. doi: 10.1093/bioinformatics/btw319. Epub 2016 Jun 9.

Abstract

MOTIVATION

Repeat proteins, which contain multiple repeats of short sequence motifs, form a large but seldom-studied group of proteins. Methods that analyze the 3D structures of such proteins have identified many subtle effects in the length distribution of individual motifs that are important for their functions. However, similar analysis has not yet been applied to the vast majority of repeat proteins with unknown 3D structures, mostly because of the extreme diversity of the underlying motifs and the resulting difficulty of detecting them.

RESULTS

We developed FAIT, a sequence-based algorithm for the precise assignment of individual repeats in repeat proteins, and introduced a framework to classify and compare aperiodicity patterns for large protein families. FAIT extracts repeat positions by post-processing FFAS alignment matrices with image processing methods. Using proteins with leucine-rich repeat (LRR) domains and other solenoid-like proteins as examples, we show that automated analysis with FAIT correctly identifies the exact lengths of individual repeats based entirely on sequence information.
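
As a rough illustration of the general idea, and not the FAIT implementation: in a sequence self-comparison matrix, tandem repeats show up as stripes parallel to the main diagonal, and the offset of the nearest stripe corresponds to the repeat length. The sketch below makes two simplifying assumptions: a plain identity-based matrix stands in for an FFAS alignment matrix, and a simple per-diagonal average stands in for the image processing step. All function names are hypothetical.

```python
def self_similarity(seq):
    """Binary self-comparison matrix: 1 where residues are identical, else 0.

    Assumption: a toy stand-in for a profile-based alignment matrix.
    """
    n = len(seq)
    return [[1 if seq[i] == seq[j] else 0 for j in range(n)] for i in range(n)]


def diagonal_profile(matrix):
    """Mean score along each off-diagonal at offset k (k = 1 .. n-1)."""
    n = len(matrix)
    profile = {}
    for k in range(1, n):
        cells = [matrix[i][i + k] for i in range(n - k)]
        profile[k] = sum(cells) / len(cells)
    return profile


def estimate_period(seq, min_len=2, max_len=None):
    """Return the smallest offset whose diagonal has the highest mean score."""
    if max_len is None:
        max_len = len(seq) // 2
    profile = diagonal_profile(self_similarity(seq))
    # max() keeps the first maximum, so the shortest best offset wins.
    return max(range(min_len, max_len + 1), key=lambda k: profile[k])


if __name__ == "__main__":
    # Toy sequence: a made-up 12-residue motif repeated four times.
    toy = "LKELDLSNNQIT" * 4
    print(estimate_period(toy))
```

Because this averages over whole diagonals, it yields only a single global period. Assigning the exact length of each individual repeat, which is what the abstract reports for FAIT, would require a local analysis of where the stripes shift.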

AVAILABILITY AND IMPLEMENTATION

https://github.com/GodzikLab/FAIT

CONTACT

adam@godziklab.org

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
