Ioannidou Zoi S, Theodoropoulou Margarita C, Papandreou Nikos C, Willis Judith H, Hamodrakas Stavros J
Department of Cell Biology and Biophysics, Faculty of Biology, University of Athens, Panepistimiopolis, Athens 157 01, Greece.
Department of Cellular Biology, University of Georgia, Athens, GA 30602, USA.
Insect Biochem Mol Biol. 2014 Sep;52:51-9. doi: 10.1016/j.ibmb.2014.06.004. Epub 2014 Jun 27.
The arthropod cuticle is a composite, bipartite system made of chitin filaments embedded in a proteinaceous matrix. The physical properties of cuticle are determined by the structure and interactions of its two major components, cuticular proteins (CPs) and chitin. The proteinaceous matrix consists mainly of structural cuticular proteins. Most of the structural proteins described to date belong to the CPR family and are identified by the conserved R&R region (Rebers and Riddiford Consensus). Two major subfamilies of the CPR family, RR-1 and RR-2, have also been identified on the basis of sequence-level conservation and some correlation with cuticle type. Recently, several novel families, also containing characteristic conserved regions, have been described. The package HMMER v3.0 (http://hmmer.janelia.org/) was used to build profile Hidden Markov Models from the characteristic regions of 8 of these families (CPF, CPAP3, CPAP1, CPCFC, CPLCA, CPLCG, CPLCW, Tweedle). In brief, these families can be described as follows: CPF (a conserved region of 44 amino acids); CPAP1 and CPAP3 (analogous to peritrophins, with 1 and 3 chitin-binding domains, respectively); CPCFC (2 or 3 C-x(5)-C repeats); and four of five low complexity (LC) families, each with characteristic domains. Using these models, together with the models previously created for the two major subfamilies of the CPR family, RR-1 and RR-2 (Karouzou et al., 2007), we developed CutProtFam-Pred, an online tool (http://bioinformatics.biol.uoa.gr/CutProtFam-Pred) that accepts query sequences from proteomes or translated transcriptomes and detects and classifies putative structural cuticular proteins. The tool has been applied successfully to diverse arthropod proteomes, including a crustacean (Daphnia pulex) and a chelicerate (Tetranychus urticae), but at this taxonomic distance only CPRs and CPAPs were recovered.
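To make the C-x(5)-C notation used for the CPCFC family concrete, the following is a minimal Python sketch that counts occurrences of that pattern (a cysteine, any five residues, then a cysteine) in a protein sequence. It is an illustration only and is not how CutProtFam-Pred classifies sequences, since the tool relies on profile Hidden Markov Models rather than a regular expression. The function names and the example sequences are hypothetical, and sequences are assumed to be single-line strings.

```python
import re

# PROSITE-style C-x(5)-C: a cysteine, any five residues, then a cysteine.
# The zero-width lookahead lets overlapping occurrences be counted as well.
CX5C = re.compile(r"(?=C.{5}C)")


def count_cx5c(sequence: str) -> int:
    """Count C-x(5)-C occurrences in a single-line protein sequence."""
    return sum(1 for _ in CX5C.finditer(sequence.upper()))


def has_cpcfc_like_repeats(sequence: str) -> bool:
    """Hypothetical pre-filter: True when the sequence has 2 or 3 repeats.

    A crude motif count, not a substitute for scoring against a profile HMM.
    """
    return count_cx5c(sequence) in (2, 3)


if __name__ == "__main__":
    # Made-up toy sequences, not real cuticular proteins.
    for seq in ("AAACAAAAACAAA", "CAAAAACAAAAAC", "AAAAAAAA"):
        print(seq, count_cx5c(seq), has_cpcfc_like_repeats(seq))
```

A motif count of this kind can only serve as a quick sanity check on a candidate sequence; family assignment in the work described here depends on the HMM-based models.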