Cao Xiaolong, Gulati Mansi, Jiang Haobo
Department of Entomology and Plant Pathology, Oklahoma State University, Stillwater, OK 74078, USA.
Insect Biochem Mol Biol. 2017 Sep;88:48-62. doi: 10.1016/j.ibmb.2017.07.008. Epub 2017 Aug 2.
Insect serine proteases (SPs) and serine protease homologs (SPHs) participate in digestion, defense, development, and other physiological processes. In mosquitoes, some clip-domain SPs and SPHs (i.e., CLIPs) have been investigated for possible roles in antiparasitic responses. In a recent test aimed at improving the quality of gene models in the Anopheles gambiae genome using RNA-seq data, we observed various discrepancies between gene models in AgamP4.5 and the corresponding sequences selected from those modeled by Cufflinks, Trinity and Bridger. Here we report a comparative analysis of the 337 SP-related proteins in A. gambiae, examining their domain structures, sequence diversity, chromosomal locations, and expression patterns. One hundred and ten CLIPs contain 1 to 5 clip domains in addition to their protease domains (PDs) or non-catalytic, protease-like domains (PLDs). They are divided into five subgroups: CLIPAs (22) are clip-PLD; CLIPBs (29), CLIPCs (12) and CLIPDs (14) are mainly clip-PD; most CLIPEs (33) have a domain structure of PD/PLD-PLD-clip-PLD. While expression of the CLIP genes in group-1 is generally low and detected in various tissue- and stage-specific RNA-seq libraries, some putative GPs/GPHs (i.e., single-domain gut SPs/SPHs) in group-2 are highly expressed in midgut, whole larva or whole adult libraries. In comparison, 46 SPs, 26 SPHs, and 37 multi-domain SPs/SPHs (i.e., PD/PLD-PLD) in group-3 do not seem to be specifically expressed in the digestive tract. There are 16 SPs and 2 SPHs containing other types of putative regulatory domains (e.g., LDLa, CUB, Gd). Of the 337 SP and SPH genes, 159 were sorted into 46 groups (2-8 members per group) based on similar phylogenetic tree position, chromosomal location, and expression profile. This information and analysis, including improved gene models and protein sequences, constitute a solid foundation for functional analysis of the SP-related proteins in A. gambiae.