Department of Computer and Information Sciences, College of Science and Technology (CST), Covenant University, Ota, Ogun State, Nigeria.
Sci Rep. 2021 Jul 20;11(1):14806. doi: 10.1038/s41598-021-94347-6.
Tuberculosis has the highest death rate among diseases caused by a single type of micro-organism. The disease is a significant problem in most developing countries because of limited diagnostic and treatment capacity. Early diagnosis is the most effective way of managing the disease in patients and reducing its mortality rate. Although several methods for diagnosing tuberculosis exist, their limitations, ranging from the cost of testing to the time taken to obtain results, have hindered early diagnosis. This work aims to develop a predictive model that supports the diagnosis of tuberculosis (TB) using an extended weighted voting ensemble method. The research involved analyzing tuberculosis gene expression data obtained from the GEO (Gene Expression Omnibus) database and developing a classification model to aid tuberculosis diagnosis. The classification model combines two classifiers, Naïve Bayes (NB) and Support Vector Machine (SVM). The weighted voting ensemble technique was used to improve the model's performance by combining the classification results of the single classifiers and selecting the class with the highest vote, based on the weights given to the single classifiers. Experimental analysis indicates an accuracy of 0.95 for the enhanced ensemble classifier, better than the single classifiers, with SVM and NB achieving 0.92 and 0.87, respectively. The developed model can assist health practitioners in the timely diagnosis of tuberculosis, which could reduce the mortality rate caused by the disease, especially in developing countries.
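The weighted voting step described above can be illustrated with a minimal sketch. The abstract does not state whether the votes are hard labels or class probabilities, nor how the weights were chosen, so this example makes assumptions: it uses soft voting over per-class probabilities, and it reuses the reported single-classifier accuracies (0.92 for SVM, 0.87 for NB) as weights purely for illustration. The function name `weighted_soft_vote`, the class labels, and the sample probabilities are hypothetical and are not taken from the study.

```python
def weighted_soft_vote(probabilities, weights):
    """Combine per-classifier class probabilities by a weighted average.

    probabilities: dict mapping classifier name -> {label: probability}
    weights: dict mapping classifier name -> non-negative weight
    Returns the winning label and the combined score for every label.
    """
    # Collect every label that any classifier reported.
    labels = set()
    for class_probs in probabilities.values():
        labels.update(class_probs)

    total_weight = sum(weights[name] for name in probabilities)

    # Weighted average of the probabilities assigned to each label.
    scores = {}
    for label in sorted(labels):
        weighted_sum = sum(
            weights[name] * class_probs.get(label, 0.0)
            for name, class_probs in probabilities.items()
        )
        scores[label] = weighted_sum / total_weight

    # Iterating in sorted order makes tie-breaking deterministic.
    winner = max(sorted(scores), key=scores.get)
    return winner, scores


# Assumption: weights set to the accuracies reported in the abstract.
weights = {"SVM": 0.92, "NB": 0.87}

# Hypothetical outputs for one sample on which the two classifiers disagree.
sample = {
    "SVM": {"TB": 0.40, "control": 0.60},
    "NB": {"TB": 0.90, "control": 0.10},
}

label, scores = weighted_soft_vote(sample, weights)
print(label, scores)
```

With only two classifiers, hard-label weighted voting would always follow the more heavily weighted classifier, which is why this sketch averages probabilities instead: a confident prediction from the lower-weighted classifier can still decide the outcome.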