BacPaCS: Bacterial Pathogenicity Classification via Sparse-SVM.

Author information

Department of Computer Science, Faculty of Natural Sciences.

The Shraga Segal Department of Microbiology, Immunology and Genetics, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel.

Publication information

Bioinformatics. 2019 Jun 1;35(12):2001-2008. doi: 10.1093/bioinformatics/bty928.

Abstract

MOTIVATION

Bacterial infections are a major cause of illness worldwide. However, most bacterial strains pose no threat to human health and may even be beneficial. Thus, developing powerful diagnostic bioinformatic tools that differentiate pathogenic from commensal bacteria is critical for effective treatment of bacterial infections.

RESULTS

We propose a machine-learning approach for classifying human-hosted bacteria as pathogenic or non-pathogenic based on their genome-derived proteomes. Our approach is based on a sparse Support Vector Machine (SVM), which automatically selects a small set of genes that are related to bacterial pathogenicity. We implement our approach as a fully automated tool, 'Bacterial Pathogenicity Classification via sparse-SVM' (BacPaCS), which handles datasets significantly larger than those previously used. BacPaCS shows high accuracy in distinguishing pathogenic from non-pathogenic bacteria in a clinically relevant dataset comprising only human-hosted bacteria. Among the genes that received the highest positive weights in the resulting classifier, we found genes known to be related to bacterial pathogenicity, as well as novel candidates whose involvement in bacterial virulence has not previously been reported.
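
To illustrate how a sparse SVM can select a small set of features, the following is a minimal sketch and not the BacPaCS implementation. It assumes an L1-regularized linear SVM trained by proximal subgradient descent on a hypothetical presence/absence matrix of protein families. The family names, the toy data, and the values of `lam`, `lr` and `steps` are all illustrative assumptions; the toy data are constructed so that a single family separates the two classes.

```python
# Toy sketch of a sparse (L1-regularized) linear SVM. Not the BacPaCS code:
# the data, names and hyperparameters below are hypothetical.

def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink toward zero, clip at zero."""
    magnitude = max(abs(value) - threshold, 0.0)
    return magnitude if value >= 0 else -magnitude


def train_sparse_svm(X, y, lam=0.3, lr=0.1, steps=1000):
    """Minimise mean hinge loss + lam * ||w||_1 by proximal subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        # Only samples that violate the margin contribute to the subgradient.
        active = [i for i in range(n)
                  if y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b) < 1.0]
        grad_w = [-sum(y[i] * X[i][j] for i in active) / n for j in range(d)]
        grad_b = -sum(y[i] for i in active) / n
        # Gradient step on the hinge term, then L1 shrinkage on the weights.
        w = [soft_threshold(w[j] - lr * grad_w[j], lr * lam) for j in range(d)]
        b -= lr * grad_b  # the bias is left unregularized
    return w, b


def predict(w, b, x):
    """Return +1 (pathogenic) or -1 (non-pathogenic) for one genome."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1


# Hypothetical protein families (columns) and genomes (rows); 1 = present.
FAMILIES = ["family_A", "family_B", "family_C", "family_D"]
X = [
    [1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 0, 1], [1, 0, 1, 0],  # pathogenic
    [0, 1, 0, 0], [0, 0, 1, 1], [0, 1, 0, 0], [0, 0, 1, 0],  # non-pathogenic
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

weights, bias = train_sparse_svm(X, y)

# Families with a nonzero weight are the ones the sparse model "selected".
selected = [name for name, wj in zip(FAMILIES, weights) if wj != 0.0]
print("selected families:", selected)
print("weights:", [round(wj, 3) for wj in weights])
```

In this toy setting the L1 penalty should leave a nonzero weight only on the family that separates the classes, which mirrors the idea of reading candidate pathogenicity genes off the largest positive weights. Real proteome-derived feature matrices are far larger and noisier, so a dedicated solver would be the practical choice.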

AVAILABILITY AND IMPLEMENTATION

The code and the resulting model are available at: https://github.com/barashe/bacpacs.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
