RepSeq--a database of amino acid repeats present in lower eukaryotic pathogens.

Authors

Depledge Daniel P, Lower Ryan P J, Smith Deborah F

Affiliation

Immunology and Infection Unit, Department of Biology, University of York, Heslington, York, UK.

Publication

BMC Bioinformatics. 2007 Apr 11;8:122. doi: 10.1186/1471-2105-8-122.

DOI:10.1186/1471-2105-8-122

PMID:17428323

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1854910/

Abstract

BACKGROUND

Amino acid repeat-containing proteins have a broad range of functions, and their identification is relevant to many experimental biologists. In human-infective protozoan parasites (such as the Kinetoplastid and Plasmodium species), they are implicated in immune evasion and have been shown to influence virulence and pathogenicity. RepSeq (http://repseq.gugbe.com) is a new database of amino acid repeat-containing proteins found in lower eukaryotic pathogens. The database is accessed through a web-based application, which also links to related online tools and databases for further analysis.

RESULTS

The RepSeq algorithm typically identifies more than 98% of repeat-containing proteins and can identify both perfect and mismatch repeats. The proportion of proteins that contain repeat elements varies greatly between families and even between species (3-35% of the total protein content). The most common motif type is the Sequence Repeat Region (SRR), a repeated motif containing multiple different amino acid types. Proteins containing Single Amino Acid Repeats (SAARs) and Di-Peptide Repeats (DPRs) typically account for 0.5-1.0% of the total protein number. Notable exceptions are P. falciparum and D. discoideum, in which repeat-containing proteins make up 33.67% and 34.28% of the predicted proteome, respectively. These high proportions are due to large insertions of low-complexity single and multi-codon repeat regions.
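As a rough illustration of two of the repeat classes named above, the following sketch finds perfect SAARs and DPRs with regular expressions. It is not the RepSeq algorithm, whose parameters the abstract does not give: the function names, the minimum lengths (`min_len=5` residues, `min_units=3` copies), and the example sequence are all assumptions made for illustration, and mismatch repeats and SRRs are not handled.

```python
import re

def find_saars(seq, min_len=5):
    """Return (start, residue, length) for each run of one residue
    that is at least min_len long (min_len is an assumed threshold)."""
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq)]

def find_dprs(seq, min_units=3):
    """Return (start, unit, copies) for each tandem run of a two-residue
    unit with at least min_units copies (min_units is an assumed threshold)."""
    # (?!\2) forces the two residues of the unit to differ, so that a
    # single amino acid repeat is not also reported as a di-peptide repeat.
    pattern = re.compile(r"(([A-Z])(?!\2)[A-Z])\1{%d,}" % (min_units - 1))
    return [(m.start(), m.group(1), len(m.group(0)) // 2)
            for m in pattern.finditer(seq)]

# Hypothetical sequence: a run of six Q residues followed by four SA units.
example = "MKQQQQQQLSASASASAPT"
saars = find_saars(example)
dprs = find_dprs(example)
```

Positions are zero-based. Tolerating mismatches, as the abstract says RepSeq does, would need an approximate-matching step that this sketch leaves out.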

CONCLUSION

The RepSeq database provides a repository for repeat-containing proteins found in parasitic protozoa. The database allows for both individual and cross-species proteome analyses and also allows users to upload sequences of interest for analysis by the RepSeq algorithm. Identification of repeat-containing proteins provides researchers with a defined subset of proteins which can be analysed by expression profiling and functional characterisation, thereby facilitating study of pathogenicity and virulence factors in the parasitic protozoa. While primarily designed for kinetoplastid work, the RepSeq algorithm and database retain full functionality when used to analyse other species.
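To make the idea of a proteome-level comparison concrete, the sketch below computes the percentage of proteins in a FASTA-formatted set that contain at least one repeat. It is an illustration under stated assumptions, not the RepSeq implementation: the repeat criterion (a run of five or more identical residues), the function names, and the two example records are hypothetical.

```python
import re

# Assumed criterion: a run of 5+ identical residues marks a
# repeat-containing protein. RepSeq's real criteria are not given here.
SAAR = re.compile(r"([A-Z])\1{4,}")

def parse_fasta(text):
    """Parse FASTA text into a dict of {identifier: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the identifier.
            name = (line[1:].split() or ["unnamed"])[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}

def repeat_fraction(proteome):
    """Percentage of proteins that contain at least one assumed repeat."""
    if not proteome:
        return 0.0
    hits = sum(1 for seq in proteome.values() if SAAR.search(seq))
    return 100.0 * hits / len(proteome)

# Two hypothetical proteins: protA has a run of six N residues, protB has none.
example = """>protA hypothetical
MKNNNNNNLT
>protB hypothetical
MKTAYIAKQR
"""
proteome = parse_fasta(example)
percent = repeat_fraction(proteome)
```

Running the same function over one FASTA file per species would give per-species percentages that can be compared side by side.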

Figure (image from the full-text article): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/751f/1854910/d3bda3dd6f48/1471-2105-8-122-1.jpg
