Gruber Ansgar, Vohnoutová Marta, McKay Cedar, Rocap Gabrielle, Oborník Miroslav
Biology Centre, Institute of Parasitology, Czech Academy of Sciences, České Budějovice, Czech Republic.
Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic.
Plant J. 2025 Jun;122(5):e70138. doi: 10.1111/tpj.70138.
Plastids of diatoms and related algae with complex plastids of red algal origin are surrounded by four membranes, which also define the periplastidic compartment (PPC), the space between the second and third membranes. Metabolic reactions as well as cell biological processes take place in the PPC; however, genome-wide predictions of the proteins targeted to this compartment have so far relied on manual annotation. Using published experimental protein localizations as reference data, we developed the first automatic prediction method for PPC proteins, which we included as a new feature in an updated version of the plastid protein predictor ASAFind. With our method, at least a subset of the PPC proteins can be predicted with high specificity, with an estimate of at least 81 proteins (0.7% of the predicted proteome) targeted to the PPC in the model diatom Phaeodactylum tricornutum. The proportion of PPC proteins varies between species: 180 PPC proteins (1.3% of the predicted proteome) were predicted in the genome of the diatom Thalassiosira pseudonana. The new ASAFind version also accepts the output of the recent versions of SignalP (5.0) and TargetP (2.0) as input data, and it can generate a newly designed graphical output that visualizes the contribution of each position in the sequence to the score. Furthermore, we release a script to calculate custom scoring matrices that can be used for predictions in a simplified score cut-off mode. This allows the method to be adapted to other groups of algae.
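To illustrate what a custom scoring matrix and a score cut-off mean in general terms, the following is a minimal sketch and not the ASAFind script itself. It assumes a log-odds position-specific scoring matrix with a uniform background and a pseudocount; the function names, the training windows, and the cut-off value are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_scoring_matrix(windows, pseudocount=1.0):
    """Build a log-odds matrix from equal-length sequence windows.

    Assumption: uniform background frequency (1/20) and additive
    pseudocounts; the real ASAFind matrices may be derived differently.
    """
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        matrix.append(column)
    return matrix


def position_contributions(matrix, window):
    """Per-position score contributions (unknown residues contribute 0)."""
    return [column.get(aa, 0.0) for column, aa in zip(matrix, window)]


def score_window(matrix, window):
    """Total score of a window is the sum of its position contributions."""
    return sum(position_contributions(matrix, window))


def predict(matrix, window, cutoff):
    """Simplified cut-off mode: positive if the score reaches the cut-off."""
    return score_window(matrix, window) >= cutoff


# Hypothetical training windows around a cleavage site (made-up data).
training = ["ASAFAP", "ASAFSP", "GSAFAP", "ASAFAP"]
matrix = build_scoring_matrix(training)

for candidate in ("ASAFAP", "KKKKKK"):
    print(candidate, round(score_window(matrix, candidate), 2),
          predict(matrix, candidate, cutoff=0.0))
```

The per-position contributions returned by `position_contributions` are the kind of values a graphical output could plot along the sequence. In practice the cut-off would be calibrated against reference proteins with known localization rather than fixed at zero.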