Distribution and cluster analysis of predicted intrinsically disordered protein Pfam domains.

Authors

Williams Robert W, Xue Bin, Uversky Vladimir N, Dunker A Keith

Affiliations

Department of Biomedical Informatics; Uniformed Services University; Bethesda, MD USA.

Center for Computational Biology and Bioinformatics; Indiana School of Medicine; Indianapolis, IN USA.

Publication information

Intrinsically Disord Proteins. 2013 Apr 1;1(1):e25724. doi: 10.4161/idp.25724. eCollection 2013 Jan-Dec.

DOI:10.4161/idp.25724
PMID:28516017
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5424788/
Abstract

The Pfam database groups regions of proteins by how well hidden Markov models (HMMs) can be trained to recognize similarities among them. Conservation pressure is probably in play here. The Pfam seed training set includes sequence and structure information, being drawn largely from the PDB. A long standing hypothesis among intrinsically disordered protein (IDP) investigators has held that conservation pressures are also at play in the evolution of different kinds of intrinsic disorder, but we find that predicted intrinsic disorder (PID) is not always conserved across Pfam domains. Here we analyze distributions and clusters of PID regions in 193024 members of the version 23.0 Pfam seed database. To include the maximum information available for proteins that remain unfolded in solution, we employ the 10 linearly independent Kidera factors for the amino acids, combined with PONDR predictions of disorder tendency, to transform the sequences of these Pfam members into an 11 column matrix where the number of rows is the length of each Pfam region. Cluster analyses of the set of all regions, including those that are folded, show 6 groupings of domains. Cluster analyses of domains with mean VSL2b scores greater than 0.5 (half predicted disorder or more) show at least 3 separated groups. It is hypothesized that grouping sets into shorter sequences with more uniform length will reveal more information about intrinsic disorder and lead to more finely structured and perhaps more accurate predictions. HMMs could be trained to include this information.
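The encoding described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the factor table is a placeholder filled with zeros rather than real Kidera factors, and the per-residue disorder scores are made up (in practice they would come from a predictor such as PONDR VSL2b). Each region becomes a matrix with one row per residue and 11 columns, and a region counts as mostly disordered when its mean score exceeds 0.5.

```python
from statistics import mean

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Placeholder table: 10 zeros per residue. These are NOT real Kidera factors;
# substitute the published values before doing any actual analysis.
PLACEHOLDER_KIDERA = {aa: [0.0] * 10 for aa in AMINO_ACIDS}


def encode_region(sequence, disorder_scores, kidera_table=PLACEHOLDER_KIDERA):
    """Return one 11-value row per residue: 10 factors plus a disorder score."""
    if len(sequence) != len(disorder_scores):
        raise ValueError("need exactly one disorder score per residue")
    return [list(kidera_table[aa]) + [score]
            for aa, score in zip(sequence, disorder_scores)]


def mean_disorder(matrix):
    """Mean of the 11th column (the per-residue disorder score)."""
    return mean(row[10] for row in matrix)


def is_mostly_disordered(matrix, threshold=0.5):
    """Apply the cut-off from the abstract: mean score greater than 0.5."""
    return mean_disorder(matrix) > threshold


# Toy region with made-up scores; real scores would come from a predictor.
toy_matrix = encode_region("MKSE", [0.9, 0.7, 0.4, 0.8])
print(len(toy_matrix), len(toy_matrix[0]), is_mostly_disordered(toy_matrix))
```

Because the number of rows equals the region length, regions of different lengths yield matrices of different shapes. How such matrices are compared for clustering is not stated in the abstract, so the sketch stops at the encoding and the threshold test.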

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cca6/5424788/6a2304157c99/kidp-01-01-10925724-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cca6/5424788/bfad8f2bbcfa/kidp-01-01-10925724-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cca6/5424788/b404e767fc45/kidp-01-01-10925724-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cca6/5424788/a94da1b93444/kidp-01-01-10925724-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cca6/5424788/699fb6e9ef31/kidp-01-01-10925724-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cca6/5424788/8ea3eb2dd789/kidp-01-01-10925724-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cca6/5424788/a571ddcaa1dc/kidp-01-01-10925724-g008.jpg

Similar articles

1
Distribution and cluster analysis of predicted intrinsically disordered protein Pfam domains.
Intrinsically Disord Proteins. 2013 Apr 1;1(1):e25724. doi: 10.4161/idp.25724. eCollection 2013 Jan-Dec.
2
Intrinsically disordered domains: Sequence ➔ disorder ➔ function relationships.
Protein Sci. 2019 Sep;28(9):1652-1663. doi: 10.1002/pro.3680. Epub 2019 Aug 9.
3
Intrinsic disorder in the Protein Data Bank.
J Biomol Struct Dyn. 2007 Feb;24(4):325-42. doi: 10.1080/07391102.2007.10507123.
4
Pfam: a comprehensive database of protein domain families based on seed alignments.
Proteins. 1997 Jul;28(3):405-20. doi: 10.1002/(sici)1097-0134(199707)28:3<405::aid-prot10>3.0.co;2-l.
5
MobiDB 2.0: an improved database of intrinsically disordered and mobile proteins.
Nucleic Acids Res. 2015 Jan;43(Database issue):D315-20. doi: 10.1093/nar/gku982. Epub 2014 Oct 31.
6
Detection of orphan domains in Drosophila using "hydrophobic cluster analysis".
Biochimie. 2015 Dec;119:244-53. doi: 10.1016/j.biochi.2015.02.019. Epub 2015 Feb 28.
7
Bioinformatic Identification of Rare Codon Clusters (RCCs) in HBV Genome and Evaluation of RCCs in Proteins Structure of Hepatitis B Virus.
Hepat Mon. 2016 Oct 4;16(10):e39909. doi: 10.5812/hepatmon.39909. eCollection 2016 Oct.
8
Intrinsically disordered domains deviate significantly from random sequences in mammalian proteins.
BMC Bioinformatics. 2010 Oct 15;11 Suppl 7(Suppl 7):S7. doi: 10.1186/1471-2105-11-S7-S7.
9
The Bologna Annotation Resource (BAR 3.0): improving protein functional annotation.
Nucleic Acids Res. 2017 Jul 3;45(W1):W285-W290. doi: 10.1093/nar/gkx330.
10
Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins.
Nucleic Acids Res. 1999 Jan 1;27(1):260-2. doi: 10.1093/nar/27.1.260.

Cited by

1
Intrinsic Disorder in the Human Tear Proteome.
Invest Ophthalmol Vis Sci. 2023 Aug 1;64(11):14. doi: 10.1167/iovs.64.11.14.
2
Digging into the 3D Structure Predictions of AlphaFold2 with Low Confidence: Disorder and Beyond.
Biomolecules. 2022 Oct 13;12(10):1467. doi: 10.3390/biom12101467.
3
Structural and Functional Characterization of the ABA-Water Deficit Stress Domain from Wheat and Barley: An Intrinsically Disordered Domain behind the Versatile Functions of the Plant Abscissic Acid, Stress and Ripening Protein Family.
Int J Mol Sci. 2021 Feb 26;22(5):2314. doi: 10.3390/ijms22052314.
4
Intrinsically disordered domains: Sequence ➔ disorder ➔ function relationships.
Protein Sci. 2019 Sep;28(9):1652-1663. doi: 10.1002/pro.3680. Epub 2019 Aug 9.
5
Polymorphism Analysis Reveals Reduced Negative Selection and Elevated Rate of Insertions and Deletions in Intrinsically Disordered Protein Regions.
Genome Biol Evol. 2015 Jun 4;7(6):1815-26. doi: 10.1093/gbe/evv105.
6
Prediction of protein structural features from sequence data based on Shannon entropy and Kolmogorov complexity.
PLoS One. 2015 Apr 9;10(4):e0119306. doi: 10.1371/journal.pone.0119306. eCollection 2015.
7
Identifying novel cell cycle proteins in Apicomplexa parasites through co-expression decision analysis.
PLoS One. 2014 May 19;9(5):e97625. doi: 10.1371/journal.pone.0097625. eCollection 2014.

References

1
Utilization of protein intrinsic disorder knowledge in structural proteomics.
Biochim Biophys Acta. 2013 Feb;1834(2):487-98. doi: 10.1016/j.bbapap.2012.12.003. Epub 2012 Dec 8.
2
D²P²: database of disordered protein predictions.
Nucleic Acids Res. 2013 Jan;41(Database issue):D508-16. doi: 10.1093/nar/gks1226. Epub 2012 Nov 29.
3
Beyond supersecondary structure: the global properties of protein sequences.
Methods Mol Biol. 2013;932:107-14. doi: 10.1007/978-1-62703-065-6_7.
4
Orderly order in protein intrinsic disorder distribution: disorder in 3500 proteomes from viruses and the three domains of life.
J Biomol Struct Dyn. 2012;30(2):137-49. doi: 10.1080/07391102.2012.675145.
5
MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins.
Bioinformatics. 2012 Jun 15;28(12):i75-83. doi: 10.1093/bioinformatics/bts209.
6
Multiparametric analysis of intrinsically disordered proteins: looking at intrinsic disorder through compound eyes.
Anal Chem. 2012 Mar 6;84(5):2096-104. doi: 10.1021/ac203096k. Epub 2012 Jan 19.
7
Subclassifying disordered proteins by the CH-CDF plot method.
Pac Symp Biocomput. 2012:128-39.
8
Comprehensive comparative assessment of in-silico predictors of disordered regions.
Curr Protein Pept Sci. 2012 Feb;13(1):6-18. doi: 10.2174/138920312799277938.
9
Systematic analysis of tropomodulin/tropomyosin interactions uncovers fine-tuned binding specificity of intrinsically disordered proteins.
J Mol Recognit. 2011 Jul-Aug;24(4):647-55. doi: 10.1002/jmr.1093.
10
On the information content of protein sequences.
J Biomol Struct Dyn. 2011 Feb;28(4):593-4; discussion 669-674. doi: 10.1080/073911011010524957.