TNO Quality of Life, Zeist, the Netherlands.
BMC Genomics. 2010 Oct 19;11:584. doi: 10.1186/1471-2164-11-584.
The ecological niche occupied by a fungal species, its pathogenicity and its usefulness as a microbial cell factory depend to a large degree on its secretome. Protein secretion usually requires an N-terminal signal peptide (SP), and by scanning for this feature with the available, highly accurate SP-prediction tools, the fraction of potentially secreted proteins can be predicted directly. However, prediction of an SP does not guarantee that the protein is actually secreted, and current in silico prediction methods suffer from gene-model errors introduced during genome annotation.
A majority-rule-based classifier that also evaluates signal peptide predictions from the best homologs of three neighbouring Aspergillus species was developed to create an improved list of potential signal-peptide-containing proteins encoded by the Aspergillus niger genome. As a complement to these in silico predictions, the secretome associated with growth and with carbon source depletion was determined using a shotgun proteomics approach. Overall, some 200 proteins with a predicted signal peptide were identified as secreted. Concordant changes in the secretome state were observed in response to changes in growth/culture conditions. Additionally, two proteins secreted via a non-classical route operating in A. niger were identified.
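The majority-rule idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `majority_sp_call`, the use of `None` for a missing homolog, and the tie-breaking rule (fall back to the A. niger protein's own prediction) are all assumptions made for illustration. The abstract only states that predictions for the best homologs from three neighbouring Aspergillus species are evaluated alongside the A. niger prediction.

```python
from typing import Optional, Sequence


def majority_sp_call(query_call: bool,
                     homolog_calls: Sequence[Optional[bool]]) -> bool:
    """Decide by majority vote whether a protein carries a signal peptide.

    query_call    -- SP prediction for the A. niger protein itself
    homolog_calls -- SP predictions for the best homologs in the
                     neighbouring species; None marks a missing homolog
                     (hypothetical encoding, not taken from the paper)
    """
    # Only homologs that actually exist get a vote.
    votes = [query_call] + [c for c in homolog_calls if c is not None]
    positives = sum(votes)
    negatives = len(votes) - positives
    if positives == negatives:
        # Assumption: on a tie, keep the query protein's own prediction.
        return query_call
    return positives > negatives


# A query whose own gene model lacks an SP, but whose three homologs
# all have one, is called SP-positive by the vote.
rescued = majority_sp_call(False, [True, True, True])

# A query with no homologs at all keeps its own prediction.
orphan = majority_sp_call(True, [None, None, None])
```

In this sketch a likely gene-model error in one species (for example, a mispredicted start codon that hides the signal peptide) is outvoted by the homologs, which is the kind of prediction conflict that combining gene models is meant to resolve.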
We improved the in silico inventory of A. niger secretory proteins by combining different gene-model predictions from neighbouring Aspergilli, thereby avoiding prediction conflicts associated with inaccurate gene models. The expected accuracy of signal peptide prediction for proteins that lack homologous sequences in the proteomes of related species is 85%. An experimental validation of the predicted proteome confirmed the in silico predictions.