Department of Computer Applications, University College of Engineering (BIT Campus), Anna University, Tiruchirappalli, 620024, India.
Department of Electrical and Electronics Engineering, University College of Engineering (BIT Campus), Anna University, Tiruchirappalli, 620024, India.
J Med Syst. 2019 May 24;43(7):201. doi: 10.1007/s10916-019-1297-2.
Statistical classifiers with good accuracy are an essential part of research in medical data mining. Accurate prediction of lung cancer is an essential step in making effective clinical decisions. Once lung cancer has been identified, patients worldwide have only limited options for medication. Patient survival time varies with hemoglobin level and TNM stage: some groups of patients survive only briefly, while others survive much longer. This study aims to develop a prediction model for lung cancer patients that uses new clinical variables. It is based on the revised 8th edition of TNM staging for lung cancer. The new attributes were collected from SEER databases, Indian cancer hospitals and research centers. The collected attributes were classified with supervised machine learning algorithms: linear regression, the Naïve Bayes classifier and the proposed Gaussian K-Base NB classifier. In particular, the supervised machine learning algorithms show that quality of life is markedly better for TNM stage 1 patients with a normal hemoglobin level (NHBL). The proposed algorithm classified the data by tumor size and hemoglobin (HB) level, and the results were confirmed in the R environment. The continuous-attribute classification shows that TNM stage 1 lung cancer patients who maintain a standard hemoglobin level have a higher survival rate than patients with lower hemoglobin levels. The Gaussian K-Base NB classifier is more effective than the existing machine learning algorithms for lung cancer prediction. The classification accuracy of the proposed method was measured using ROC methods.
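The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' Gaussian K-Base NB classifier, whose details the abstract does not give; it is an ordinary Gaussian Naïve Bayes model over two continuous attributes (tumor size and hemoglobin level), scored with a rank-based ROC AUC. The sketch is written in Python rather than R, and every value in `rows`, the labels "long" and "short", and all function names are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, pvariance

def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature (mean, variance) for each class."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        # Small constant keeps the variance strictly positive.
        stats = [(mean(col), pvariance(col) + 1e-9) for col in zip(*members)]
        model[cls] = (len(members) / len(rows), stats)
    return model

def log_gaussian(x, mu, var):
    """Log density of a normal distribution N(mu, var) at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

def posterior(model, row, positive):
    """Posterior probability of class `positive` for one record."""
    log_joint = {}
    for cls, (prior, stats) in model.items():
        log_joint[cls] = math.log(prior) + sum(
            log_gaussian(x, mu, var) for x, (mu, var) in zip(row, stats)
        )
    top = max(log_joint.values())  # subtract the max for numerical stability
    weights = {c: math.exp(v - top) for c, v in log_joint.items()}
    return weights[positive] / sum(weights.values())

def roc_auc(scores, labels, positive):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up records: (tumor size in cm, hemoglobin in g/dL). Not study data.
rows = [(1.5, 14.2), (2.0, 13.8), (2.4, 14.9), (1.8, 13.1),
        (5.5, 10.2), (6.1, 9.8), (4.8, 11.0), (7.0, 10.5)]
labels = ["long"] * 4 + ["short"] * 4  # hypothetical survival groups

model = fit_gaussian_nb(rows, labels)
scores = [posterior(model, r, "long") for r in rows]
auc = roc_auc(scores, labels, "long")
```

Here `auc` is computed on the same records used for fitting, so it only demonstrates the mechanics; a real evaluation would score held-out patients.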