ChSeq: A database of chameleon sequences.

Author information

Li Wenlin, Kinch Lisa N, Karplus P Andrew, Grishin Nick V

Affiliations

Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050.

Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050.

Publication information

Protein Sci. 2015 Jul;24(7):1075-86. doi: 10.1002/pro.2689. Epub 2015 Jun 16.

Abstract

Chameleon sequences (ChSeqs) are identical strings of amino acids that adopt different conformations in different protein structures. Researchers have detected and studied ChSeqs to understand the interplay between local and global interactions in protein structure formation. The different secondary structures adopted by a single ChSeq challenge sequence-based secondary structure predictors. Taking advantage of the growing number of available Protein Data Bank structures, we identify a large set of ChSeqs ranging from 6 to 10 residues in length. The homologous ChSeqs discovered highlight the structural plasticity involved in biological function. Compared with previous studies, the set of unrelated ChSeqs represents an approximately 20-fold increase in the number of detected sequences, as well as an increase in the longest ChSeq length from 8 to 10 residues. We applied secondary structure predictors to our ChSeqs and found that methods based on a sequence profile outperformed methods based on a single sequence. For the unrelated ChSeqs, the evolutionary information provided by the sequence profile typically allows successful prediction of the prevailing secondary structure adopted in each protein family. Our dataset will facilitate future studies of ChSeqs, as well as interpretations of the interplay between local and nonlocal interactions. A user-friendly web interface for this ChSeq database is available at prodata.swmed.edu/chseq.
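To make the definition concrete, the following is a minimal Python sketch of how identical fragments with conflicting secondary structure could be detected. It is an illustration under stated assumptions, not the authors' pipeline: the function name `find_chameleons`, the three-state alphabet (`H` helix, `E` strand, `C` coil), the toy chains, and the strict criterion (a fragment counts only when it is entirely helix in one chain and entirely strand in another) are all hypothetical choices made for this example. Only the 6 to 10 residue length window comes from the abstract.

```python
from collections import defaultdict


def find_chameleons(chains, min_len=6, max_len=10):
    """Return fragments seen as all-helix in one chain and all-strand in another.

    chains: iterable of (chain_id, sequence, secondary_structure) tuples, where
    the two strings have equal length and the secondary structure uses the
    assumed three-state alphabet H (helix), E (strand), C (coil).
    """
    # fragment -> secondary-structure string -> set of chain ids
    observed = defaultdict(lambda: defaultdict(set))
    for chain_id, seq, ss in chains:
        if len(seq) != len(ss):
            raise ValueError(f"length mismatch in chain {chain_id}")
        for k in range(min_len, max_len + 1):
            for start in range(len(seq) - k + 1):
                fragment = seq[start:start + k]
                observed[fragment][ss[start:start + k]].add(chain_id)

    chameleons = {}
    for fragment, by_ss in observed.items():
        helix = "H" * len(fragment)
        strand = "E" * len(fragment)
        # Assumed criterion: fully helical somewhere, fully strand elsewhere.
        if helix in by_ss and strand in by_ss:
            chameleons[fragment] = {
                "helix": sorted(by_ss[helix]),
                "strand": sorted(by_ss[strand]),
            }
    return chameleons


# Toy input, invented for illustration only (not real PDB chains).
toy_chains = [
    ("chainA", "GSAVLKELAEKG", "CCHHHHHHHHCC"),
    ("chainB", "TTAVLKELAEPD", "CCEEEEEEEECC"),
]

result = find_chameleons(toy_chains)
for fragment in sorted(result, key=len, reverse=True):
    print(fragment, result[fragment])
```

In this toy input the two chains share the 8-residue string `AVLKELAE`, helical in one and strand in the other, so the sketch reports it together with its 6- and 7-residue substrings. The sketch does not separate homologous from unrelated pairs, a distinction the abstract draws; that would require additional information such as an evolutionary classification of the source chains.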
