Mohan Amrita, Sullivan William J, Radivojac Predrag, Dunker A Keith, Uversky Vladimir N
School of Informatics, Indiana University, Bloomington, IN 47401, USA.
Mol Biosyst. 2008 Apr;4(4):328-40. doi: 10.1039/b719168e. Epub 2008 Feb 21.
Parasitic protozoal infections have long been known to cause profound degrees of sickness and death in human as well as animal populations. Despite the increase in the number of annotated genomes available for a large variety of protozoa, a great deal remains to be learned about them, from their fundamental physiology to the mechanisms invoked during host-pathogen interactions. Most of these genomes share a common feature, namely a high prevalence of low complexity regions in their predicted proteins, which is believed to contribute to the uniqueness of the individual species within this diverse group of early-branching eukaryotes. In the case of Plasmodium species, which cause malaria, such regions have also been reported to hamper the identification of homologues, thus making functional genomics exceptionally challenging. One of the better accepted theories accounting for the high number of low complexity regions is the presence of intrinsic disorder in these microbes. In this study we compare the extent of intrinsic disorder among the proteins predicted to be expressed in many such ancient eukaryotic cells. Our findings indicate an unusual bias in the amino acids comprising protozoal proteomes, and show that intrinsic disorder is remarkably abundant among their predicted proteins. Additionally, the intrinsically disordered regions tend to be considerably longer in the early-branching eukaryotes. An analysis of a Plasmodium falciparum interactome indicates that mediating protein-protein interactions may be at least one function of the intrinsic disorder. This study provides a bioinformatics basis for the discovery and analysis of unfoldomes (the complement of intrinsically disordered proteins in a given proteome) of early-branching eukaryotes. It also provides new insights into the evolution of intrinsic disorder in the context of adapting to a parasitic lifestyle and lays the foundation for further work on the subject.