de Cássia Ruy Patrícia, Torrieri Raul, Toledo Juliano Simões, de Souza Alves Viviane, Cruz Angela Kaysel, Ruiz Jeronimo Conceição
Informática de Biossistemas, Centro de Pesquisas René Rachou - Fundação Oswaldo Cruz (FIOCRUZ), Belo Horizonte, MG, Brasil.
BMC Genomics. 2014 Dec 13;15(1):1100. doi: 10.1186/1471-2164-15-1100.
Proteins are composed of one or more amino acid chains and exhibit several levels of structure. Intrinsically disordered proteins (IDPs) are a class of proteins that do not fold into any particular conformation and instead exist as dynamic ensembles in their native state. Owing to their intrinsic adaptability, IDPs participate in many regulatory biological processes, including parasite immune escape. Using information from trypanosomatid proteomes, we developed a pipeline for the identification, characterization and analysis of IDPs. The pipeline employs six disorder prediction methodologies and integrates structural and functional annotation, subcellular location prediction and physicochemical properties. At the core of the IDP pipeline is a relational database that describes the protein disorder knowledge in a logically consistent manner.
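As an illustration of how per-residue calls from several disorder predictors might be combined into a single disorder content value, the following minimal Python sketch uses a simple vote threshold. The voting rule, the function names and the predictor labels are assumptions made for illustration only; the abstract does not state how the six methodologies are integrated.

```python
from typing import Dict, List


def consensus_disorder(calls: Dict[str, List[bool]], min_votes: int) -> List[bool]:
    """Combine per-residue disorder calls from several predictors.

    `calls` maps a (hypothetical) predictor name to one boolean per residue.
    A residue is called disordered when at least `min_votes` predictors agree.
    This voting rule is an assumption, not the published pipeline's method.
    """
    lengths = {len(v) for v in calls.values()}
    if len(lengths) != 1:
        raise ValueError("all predictors must cover the same residues")
    n_residues = lengths.pop()
    return [
        sum(per_predictor[i] for per_predictor in calls.values()) >= min_votes
        for i in range(n_residues)
    ]


def disorder_content(consensus: List[bool]) -> float:
    """Fraction of residues called disordered (0.0 for an empty sequence)."""
    return sum(consensus) / len(consensus) if consensus else 0.0


# Toy input: three hypothetical predictors over a five-residue sequence.
toy_calls = {
    "predictor_a": [True, True, False, False, True],
    "predictor_b": [True, False, False, True, True],
    "predictor_c": [False, True, False, False, True],
}
toy_consensus = consensus_disorder(toy_calls, min_votes=2)
toy_content = disorder_content(toy_consensus)
```

With a real proteome, each list would hold one entry per residue of a protein, and the resulting fraction could be stored alongside the annotation data in the relational database.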
The IDP pipeline showed that the proteomes of Leishmania and Trypanosoma species comprise approximately 70% and 55% IDPs, respectively. Our results indicate that IDPs in trypanosomatids contain both disorder-promoting and order-promoting amino acids. The functional annotation analysis demonstrated enrichment of selected Gene Ontology terms. A relevant association was observed between the number of disordered residues within predicted IDPs and their subcellular location, lack of transmembrane domains and lack of predicted function. We validated our computational findings with 2D electrophoresis designed for IDP identification and found that 100% of the identified protein spots had been predicted in silico.
Because no pipeline or database currently addresses IDPs in trypanosomatids, the pipeline described here represents the first attempt to establish possible correlations between protein function and structural disorder in these eukaryotes. Interestingly, all significant associations detected in the contingency analysis were observed when the protein disorder content reached approximately 40%. The exploratory data analysis allowed us to develop hypotheses regarding the association of IDPs with key biological features of these parasites, including transcription and transcriptional regulation, RNA processing and splicing, and the cytoskeleton.
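The contingency analysis mentioned above can be sketched as a 2x2 table that splits proteins at a disorder content of 40% and by the presence of a binary feature, such as a predicted transmembrane domain. The example below is a hedged illustration: the record layout, the function names and the choice of the Pearson chi-square statistic are assumptions, since the abstract does not specify which test was used, and the toy values are not data from the study.

```python
from typing import Iterable, List, Tuple


def contingency_2x2(
    records: Iterable[Tuple[float, bool]], threshold: float = 0.40
) -> List[List[int]]:
    """Build a 2x2 table from (disorder_content, has_feature) records.

    Rows: disorder content >= threshold, then < threshold.
    Columns: feature present, then feature absent.
    """
    table = [[0, 0], [0, 0]]
    for content, has_feature in records:
        row = 0 if content >= threshold else 1
        col = 0 if has_feature else 1
        table[row][col] += 1
    return table


def chi_square_2x2(table: List[List[int]]) -> float:
    """Pearson chi-square statistic for a 2x2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero; statistic is undefined")
    return n * (a * d - b * c) ** 2 / denom


# Toy records (hypothetical): disorder content and transmembrane-domain flag.
toy_records = [
    (0.55, False), (0.62, False), (0.45, True),
    (0.10, True), (0.20, True), (0.35, False),
]
toy_table = contingency_2x2(toy_records)
toy_statistic = chi_square_2x2(toy_table)
```

For small counts such as these, an exact test would normally be preferred over the chi-square approximation; the statistic is used here only because it is easy to compute without external libraries.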