fLPS: Fast discovery of compositional biases for the protein universe.

Author information

Harrison Paul M

Affiliation

Department of Biology, McGill University, Montreal, QC, Canada.

Publication information

BMC Bioinformatics. 2017 Nov 13;18(1):476. doi: 10.1186/s12859-017-1906-3.

DOI:10.1186/s12859-017-1906-3
PMID:29132292
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5684748/

Abstract

BACKGROUND

Proteins often contain regions that are compositionally biased (CB), i.e., they are made from a small subset of amino-acid residue types. These CB regions can be functionally important, e.g., the prion-forming and prion-like regions that are rich in asparagine and glutamine residues.

RESULTS

Here I report a new program fLPS that can rapidly annotate CB regions. It discovers both single-residue and multiple-residue biases. It works through a process of probability minimization. First, contigs are constructed for each amino-acid type out of sequence windows with a low degree of bias; second, these contigs are searched exhaustively for low-probability subsequences (LPSs); third, such LPSs are iteratively assessed for merger into possible multiple-residue biases. At each of these stages, efficiency measures are taken to avoid or delay probability calculations unless/until they are necessary. On a current desktop workstation, the fLPS algorithm can annotate the biased regions of the yeast proteome (>5700 sequences) in <1 s, and of the whole current TrEMBL database (>65 million sequences) in as little as ~1 h, which is >2 times faster than the commonly used program SEG, using default parameters. fLPS discovers both shorter CB regions (of the sort that are often termed 'low-complexity sequence'), and milder biases that may only be detectable over long tracts of sequence.
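
The idea of searching for low-probability subsequences can be illustrated with a minimal Python sketch. It is not the fLPS implementation: it assumes a binomial model for the chance of seeing at least k residues of one type in a stretch of length n given a background frequency f, it handles only single-residue bias, and it omits the window/contig stage, the efficiency measures, and the merging into multiple-residue biases. The function names, the `min_len` parameter, and the example background frequency are hypothetical.

```python
from math import comb


def binomial_tail(n, k, f):
    """P(X >= k) for X ~ Binomial(n, f): the chance of seeing at least k
    residues of one type in n positions at background frequency f."""
    return sum(comb(n, i) * f**i * (1 - f) ** (n - i) for i in range(k, n + 1))


def lowest_probability_subsequence(seq, residue, f, min_len=5):
    """Exhaustively scan every subsequence that starts and ends on `residue`
    and return (probability, start, end) for the least probable one.

    Indices are 0-based and inclusive. If no candidate is found, the
    result is (1.0, None, None).
    """
    positions = [i for i, aa in enumerate(seq) if aa == residue]
    best = (1.0, None, None)
    for a, start in enumerate(positions):
        for b in range(a, len(positions)):
            end = positions[b]
            n = end - start + 1   # length of the candidate subsequence
            if n < min_len:
                continue
            k = b - a + 1         # count of `residue` inside the candidate
            p = binomial_tail(n, k, f)
            if p < best[0]:
                best = (p, start, end)
    return best


# Toy sequence with a glutamine run; 0.05 is an illustrative background
# frequency, not a value taken from the article.
toy = "MKTAYIAKER" + "Q" * 10 + "LGDEVKHPST"
p, start, end = lowest_probability_subsequence(toy, "Q", f=0.05)
print(f"Q-rich region {start}-{end}, P = {p:.3g}")
```

Restricting candidates to subsequences that begin and end on the residue of interest keeps the search exhaustive while skipping spans that could only be made less probable by trimming their flanks.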

CONCLUSIONS

fLPS can readily handle very large protein data sets, such as might come from metagenomics projects. It is useful in searching for proteins with similar CB regions, and for making functional inferences about CB regions for a protein of interest. The fLPS package is available from: http://biology.mcgill.ca/faculty/harrison/flps.html , or https://github.com/pmharrison/flps , or is a supplement to this article.

Figures 1-6: see the full-text article (PMC5684748).