Systems Biology/Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute Jena, Germany.
Septomics Research Centre, Friedrich Schiller University and Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute Jena, Germany.
Front Microbiol. 2015 Mar 11;6:171. doi: 10.3389/fmicb.2015.00171. eCollection 2015.
Sepsis is a clinical syndrome that can be caused by bacteria or fungi. Early knowledge of the nature of the causative agent is a prerequisite for targeted anti-microbial therapy. Besides currently used detection methods such as blood culture and PCR-based assays, analysis of the host's transcriptional response to infecting organisms holds great promise. In this study, we examine the transcriptional footprint of infections caused by the bacterial pathogens Staphylococcus aureus and Escherichia coli and the fungal pathogens Candida albicans and Aspergillus fumigatus in a human whole-blood model. Moreover, we use the expression information to build a random forest classifier that predicts whether a sample contains a bacterial, fungal, or mock infection. After normalizing the transcription intensities using stably expressed reference genes, we filtered the gene set for biomarkers of bacterial or fungal blood infections. This selection is based on differential expression and an additional gene relevance measure. In this way, we identified 38 biomarker genes, including IL6, SOCS3, and IRG1, which had already been associated with sepsis in other studies. Using these genes, we trained the classifier and assessed its performance. It yielded 96% accuracy (sensitivities >93%, specificities >97%) in a 10-fold stratified cross-validation and 92% accuracy (sensitivities and specificities >83%) on an additional test dataset comprising Cryptococcus neoformans infections. Furthermore, the classifier is robust to Gaussian noise, suggesting that it can predict classes correctly on datasets from new species. In conclusion, this genome-wide approach demonstrates an effective feature selection process combined with the construction of a well-performing classification model. Further analyses of genes with pathogen-dependent expression patterns can provide insights into the systemic host responses, which may lead to new anti-microbial therapeutic advances.
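Because the classifier distinguishes three classes (bacterial, fungal, mock), the reported sensitivities and specificities are most naturally read as one value per class. The abstract does not state how they were computed, so the following is only a minimal sketch, assuming a one-vs-rest convention. The function name `one_vs_rest_metrics` and the toy labels are hypothetical illustrations and are not data or code from the study.

```python
from collections import Counter

# Class labels taken from the abstract; everything else here is illustrative.
CLASSES = ("bacterial", "fungal", "mock")


def one_vs_rest_metrics(y_true, y_pred, classes=CLASSES):
    """Return overall accuracy plus per-class sensitivity and specificity.

    Assumption: each class is scored one-vs-rest, i.e. it is treated as
    "positive" and all remaining classes together as "negative".
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    # Count every (true label, predicted label) combination once.
    pairs = Counter(zip(y_true, y_pred))
    total = len(y_true)
    accuracy = sum(pairs[(c, c)] for c in classes) / total

    per_class = {}
    for c in classes:
        tp = pairs[(c, c)]
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        fp = sum(n for (t, p), n in pairs.items() if t != c and p == c)
        tn = total - tp - fn - fp
        per_class[c] = {
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        }
    return accuracy, per_class


# Hypothetical toy labels (NOT results from the study): one bacterial
# sample is misclassified as fungal, all other samples are correct.
y_true = ["bacterial"] * 4 + ["fungal"] * 4 + ["mock"] * 4
y_pred = ["bacterial"] * 3 + ["fungal"] * 5 + ["mock"] * 4

accuracy, per_class = one_vs_rest_metrics(y_true, y_pred)
```

In this toy case the single error lowers the sensitivity of the bacterial class and the specificity of the fungal class, while the mock class is unaffected. This is why per-class values can differ from the overall accuracy.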