Analyzing a Lung Cancer Patient Dataset with the Focus on Predicting Survival Rate One Year after Thoracic Surgery.

Authors

Rezaei Hachesu Peyman, Moftian Nazila, Dehghani Mahsa, Samad Soltani Taha

Affiliation

Department of Health Information Technology, School of Health Management and Informatics, Tabriz University of Medical Sciences, Tabriz, Iran.

Publication information

Asian Pac J Cancer Prev. 2017 Jun 25;18(6):1531-1536. doi: 10.22034/APJCP.2017.18.6.1531.

Abstract

Background: Data mining, a concept introduced in the mid-1990s, can help researchers gain new insights and uncover unanticipated knowledge in biomedical datasets. Many problems in medicine concern the diagnosis of disease based on tests conducted on individuals at risk. Early diagnosis and treatment can improve the survival of lung cancer patients, and researchers can use data mining techniques to build effective diagnostic models. The aim of this study was to evaluate patterns in risk factor data for mortality one year after thoracic surgery for lung cancer.

Methods: The dataset used in this study contained 470 records and 17 features. First, the most important variables involved in the incidence of lung cancer were extracted using knowledge discovery and data mining algorithms such as naive Bayes and expectation maximization. Then, using a regression analysis algorithm, a questionnaire was developed to predict the risk of death one year after lung surgery. Outliers in the data were identified with a clustering algorithm, excluded, and reported. Finally, a calculator was designed to estimate the risk of one-year post-operative mortality based on a scorecard algorithm.

Results: The most important factor associated with increased mortality was large tumor size. Type II diabetes and preoperative dyspnea were also associated with lower survival. The feature patients most commonly shared within a class was forced expiratory volume in the first second (FEV1), and patients could be classified into different categories based on its level.

Conclusion: Developing a calculation-based questionnaire to diagnose disease can help identify and fill knowledge gaps in clinical practice guidelines.
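The scorecard step named in the Methods can be illustrated with a minimal sketch. The abstract does not give the model's coefficients or scaling, so everything numeric below is a hypothetical placeholder: the coefficient values, the intercept, and the "20 points doubles the odds" scaling are assumptions chosen for illustration, and only the three risk factor names come from the Results. The sketch shows one common way to turn logistic regression log-odds into integer points and back into a risk estimate; it is not the authors' calculator.

```python
import math

# Hypothetical log-odds coefficients for binary risk factors.
# These values are NOT from the paper; they are placeholders.
COEFFICIENTS = {
    "large_tumor": 1.2,
    "type2_diabetes": 0.8,
    "preop_dyspnea": 0.6,
}
INTERCEPT = -2.0          # hypothetical baseline log-odds of death within one year
POINTS_PER_DOUBLING = 20  # assumed scaling: 20 points doubles the odds

# Points per unit of log-odds under the assumed scaling.
FACTOR = POINTS_PER_DOUBLING / math.log(2)


def build_scorecard(coefficients):
    """Convert each log-odds coefficient into rounded integer points."""
    return {name: round(beta * FACTOR) for name, beta in coefficients.items()}


def total_points(scorecard, patient):
    """Sum the points of every risk factor that is present for the patient."""
    return sum(points for name, points in scorecard.items() if patient.get(name))


def risk_from_points(points):
    """Map a point total back to an estimated probability via the logistic function."""
    log_odds = INTERCEPT + points / FACTOR
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    scorecard = build_scorecard(COEFFICIENTS)
    patient = {"large_tumor": True, "type2_diabetes": False, "preop_dyspnea": True}
    points = total_points(scorecard, patient)
    print(scorecard, points, round(risk_from_points(points), 3))
```

Rounding coefficients to integer points trades a little accuracy for a score that can be added up by hand, which is the usual reason for presenting a regression model as a scorecard or questionnaire.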

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca70/6373791/bc77e38547fc/APJCP-18-1531-g001.jpg