
Two phase feature-ranking for new soil dataset for Coxiella burnetii persistence and classification using machine learning models.

Affiliations

Department of Computer Science, University of Engineering and Technology, Lahore, Pakistan.

Quality Operations Laboratory, Institute of Microbiology, University of Veterinary and Animal Sciences, Lahore, Pakistan.

Publication information

Sci Rep. 2023 Jan 2;13(1):29. doi: 10.1038/s41598-022-26956-8.

Abstract

Coxiella burnetii (Cb) is a hardy, stealthy bacterial pathogen that can be lethal to humans and animals. Its strong environmental resistance, ease of propagation, and extremely low infectious dose make it an attractive organism for biowarfare. Current research on the classification of Coxiella and on the features influencing its presence in soil is generally confined to statistical techniques. Machine learning, as an alternative to these traditional approaches, can improve epidemiological modeling for this soil-borne pathogen of public significance. We developed a two-phase feature-ranking technique for the pathogen on a new soil feature dataset. The first phase applies ReliefF (RLF), OneR (ONR), and correlation (CR); the second phase combines these methods using weighted scores to determine the final soil attribute ranks. Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Logistic Regression (LR), and Multi-Layer Perceptron (MLP) classifiers were used to classify the soil attribute dataset into Coxiella-positive and Coxiella-negative soils. The feature-ranking methods established that potassium, chromium, cadmium, nitrogen, organic matter, and soluble salts are the most significant attributes, while manganese, clay, phosphorus, copper, and lead contribute least to the prevalence of the bacteria. Overall, potassium is the most influential soil feature and manganese the least significant. Among the ranking methods, RLF gives the most promising results, with accuracies of 80.85% for MLP, 79.79% for LR, and 79.8% for LDA. SVM and MLP are the best-performing classifiers: SVM yields accuracies of 82.98% and 81.91% for attribute ranking by CR and RLF, respectively, and MLP yields 76.60% for ONR. Thus, machine learning models can help us better understand the environmental conditions associated with the prevalence of the bacteria and decrease the chances of false classification. This, in turn, can assist in controlling epidemics and alleviating their devastating socio-economic effects.
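The two-phase ranking described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so everything below is an assumption made for illustration: phase two is taken to be a weighted sum of per-method ranks with equal weights, the ReliefF step is simplified to basic Relief with a single nearest hit and miss, OneR is approximated by equal-width binning with a majority-class rule, and the data and feature names (`informative`, `noise_a`, `noise_b`) are synthetic placeholders rather than the paper's soil attributes.

```python
import math
import random

def pearson_abs(xs, ys):
    """Absolute Pearson correlation between one feature and the class label."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (v - my) for x, v in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((v - my) ** 2 for v in ys)
    return abs(cov) / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def one_r_accuracy(xs, ys, bins=3):
    """OneR-style score: training accuracy of a majority-class rule per bin."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / bins or 1.0
    counts = {}
    for x, v in zip(xs, ys):
        b = min(int((x - lo) / width), bins - 1)
        counts.setdefault(b, {}).setdefault(v, 0)
        counts[b][v] += 1
    return sum(max(c.values()) for c in counts.values()) / len(xs)

def relief_scores(X, y):
    """Basic Relief (one nearest hit and miss), a simplification of ReliefF."""
    n, d = len(X), len(X[0])
    lo = [min(row[j] for row in X) for j in range(d)]
    span = [(max(row[j] for row in X) - lo[j]) or 1.0 for j in range(d)]
    Z = [[(row[j] - lo[j]) / span[j] for j in range(d)] for row in X]
    w = [0.0] * d
    for i in range(n):
        def nearest(same):
            cands = [k for k in range(n) if k != i and (y[k] == y[i]) == same]
            return min(cands, key=lambda k: sum(abs(a - b) for a, b in zip(Z[i], Z[k])))
        hit, miss = nearest(True), nearest(False)
        for j in range(d):
            w[j] += (abs(Z[i][j] - Z[miss][j]) - abs(Z[i][j] - Z[hit][j])) / n
    return w

def to_ranks(scores):
    """Convert scores to ranks, where rank 1 is the highest score."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0] * len(scores)
    for pos, j in enumerate(order, start=1):
        ranks[j] = pos
    return ranks

def two_phase_ranking(X, y, names, weights=(1.0, 1.0, 1.0)):
    """Phase 1: three rankers. Phase 2: weighted sum of ranks (assumed scheme)."""
    cols = list(zip(*X))
    phase1 = [
        relief_scores(X, y),
        [one_r_accuracy(c, y) for c in cols],
        [pearson_abs(c, y) for c in cols],
    ]
    rank_lists = [to_ranks(s) for s in phase1]
    combined = [sum(w * r[j] for w, r in zip(weights, rank_lists))
                for j in range(len(names))]
    # A lower combined rank score means a more important feature.
    return [names[j] for j in sorted(range(len(names)), key=lambda j: combined[j])]

# Synthetic demo data: one feature tracks the label, two are pure noise.
random.seed(0)
names = ["informative", "noise_a", "noise_b"]
y = [i % 2 for i in range(40)]
X = [[label + random.uniform(-0.2, 0.2), random.random(), random.random()]
     for label in y]
ranking = two_phase_ranking(X, y, names)
print(ranking)
```

Combining ranks rather than raw scores keeps the three methods on a common scale, since Relief weights, OneR accuracies, and correlations are not directly comparable. Unequal `weights` could favor one method, for example the one whose ranking gives the best downstream classifier accuracy, but that choice is not specified in the abstract.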


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c3d2/9807593/3bb5c2ddc1e7/41598_2022_26956_Fig1_HTML.jpg
