Abdallah Mutaz Mohammed, Ibrahim Suliman Ruaa Abdalla, Ahmed Yousra Tagelsir, Yahia Mawada
College of Life Sciences, Northeast Forestry University, Harbin, Heilongjiang 150040, China.
School of Pharmaceutical Science and Technology, Tianjin University, Tianjin, China.
J Genet Eng Biotechnol. 2025 Sep;23(3):100515. doi: 10.1016/j.jgeb.2025.100515. Epub 2025 Jun 19.
Bacterial genomes contain numerous hypothetical proteins (HPs) with uncharacterized roles. This study used computational methods to identify and predict the functions of such proteins in the Pseudomonas aeruginosa PAC1 strain.
The PAC1 genome (GenBank: CP053706.1) was analyzed, starting with 828 HPs. Proteins shorter than 50 amino acids (unlikely to form stable structures) were excluded, leaving 807 HPs. Physicochemical properties were assessed to filter unstable proteins, resulting in 272 candidates. Subcellular localization tools predicted cytoplasmic localization for 58 proteins. Functional annotation identified conserved domains, and homology modeling generated 3D structures for proteins with >80% similarity to known templates. Structural validation and active site prediction were performed to assess biological relevance.
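The first filtering step and the candidate counts reported above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not name the tools used, so the FASTA parser and the helper names (`parse_fasta`, `filter_by_length`, `funnel_retention`) are hypothetical. Only the 50-residue cutoff and the counts 828, 807, 272 and 58 come from the text.

```python
from typing import Iterable, Iterator, List, Tuple

# Cutoff taken from the text: proteins shorter than 50 residues were excluded.
MIN_LENGTH = 50


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted lines.

    Hypothetical helper: the identifier is the first whitespace-delimited
    token after '>'.
    """
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(
    records: Iterable[Tuple[str, str]], min_length: int = MIN_LENGTH
) -> List[Tuple[str, str]]:
    """Keep only proteins with at least `min_length` residues."""
    return [(pid, seq) for pid, seq in records if len(seq) >= min_length]


def funnel_retention(counts: List[int]) -> List[float]:
    """Percentage of candidates retained at each successive filtering step."""
    return [
        round(100.0 * after / before, 1)
        for before, after in zip(counts, counts[1:])
    ]


if __name__ == "__main__":
    # Counts reported in the abstract: initial HPs, after the length filter,
    # after the stability filter, after the cytoplasmic localization filter.
    print(funnel_retention([828, 807, 272, 58]))  # [97.5, 33.7, 21.3]
```

Read this way, the length cutoff removes only 21 of the 828 HPs, while the stability and localization filters account for most of the reduction.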
Two HPs, WP_003099663.1 (186 residues) and WP_010793930.1 (455 residues), exhibited structural stability and functional potential. WP_003099663.1 was annotated as a zinc-dependent enzyme involved in carbon dioxide regulation, while WP_010793930.1 was linked to amino acid biosynthesis. Structural models confirmed stable folds, and ligand-binding site predictions highlighted conserved regions, suggesting roles in metabolic pathways.
This study demonstrates a systematic computational approach for characterizing hypothetical proteins in bacterial genomes. WP_003099663.1 and WP_010793930.1 exhibit promising structural and functional features and warrant further experimental investigation.