Statistical learning methods as a preprocessing step for survival analysis: evaluation of concept using lung cancer data.

Affiliation

Department of Hematology and Medical Oncology, Emory University, Winship Cancer Institute, 1365 Clifton Road NE, Rm C-3090, Atlanta, GA 30322, USA.

Publication information

Biomed Eng Online. 2011 Nov 8;10:97. doi: 10.1186/1475-925X-10-97.

DOI:10.1186/1475-925X-10-97

PMID:22067671

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3280940/

Abstract

BACKGROUND

Statistical learning (SL) techniques can address non-linear relationships and small datasets but do not provide an output that has an epidemiologic interpretation.

METHODS

A small set of clinical variables (CVs) for stage-1 non-small cell lung cancer patients was used to evaluate an approach for using SL methods as a preprocessing step for survival analysis. A stochastic method of training a probabilistic neural network (PNN) was used with differential evolution (DE) optimization. Survival scores were derived stochastically by combining CVs with the PNN. Patients (n = 151) were dichotomized into favorable (n = 92) and unfavorable (n = 59) survival outcome groups. These PNN derived scores were used with logistic regression (LR) modeling to predict favorable survival outcome and were integrated into the survival analysis (i.e. Kaplan-Meier analysis and Cox regression). The hybrid modeling was compared with the respective modeling using raw CVs. The area under the receiver operating characteristic curve (Az) was used to compare model predictive capability. Odds ratios (ORs) and hazard ratios (HRs) were used to compare disease associations with 95% confidence intervals (CIs).
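The scoring and evaluation steps described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Gaussian-kernel PNN with a single fixed smoothing parameter `sigma`, whereas the study trains its network stochastically with DE, which is not reproduced here. Az is computed as the rank-based area under the ROC curve. The function names `pnn_score` and `az`, the toy data, and the value of `sigma` are all illustrative assumptions.

```python
import numpy as np

def pnn_score(train_x, train_y, query_x, sigma=1.0):
    """PNN-style posterior score for class 1 (favorable) for each query row.

    Assumption: one shared Gaussian kernel width `sigma`, chosen by hand.
    """
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y)
    query_x = np.asarray(query_x, dtype=float)
    # Squared Euclidean distance between every query row and every training row.
    d2 = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))
    # Mean kernel response per class: Parzen density estimates up to a shared constant.
    dens1 = kernel[:, train_y == 1].mean(axis=1)
    dens0 = kernel[:, train_y == 0].mean(axis=1)
    prior1 = (train_y == 1).mean()
    num = prior1 * dens1
    den = np.maximum(num + (1.0 - prior1) * dens0, np.finfo(float).tiny)
    return num / den

def az(scores, labels):
    """Area under the ROC curve as the fraction of correctly ranked pairs."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()  # ties count as half a win
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Toy one-variable example (made-up values, not study data).
train_x = [[0.0], [0.2], [2.0], [2.2]]
train_y = [0, 0, 1, 1]
scores = pnn_score(train_x, train_y, [[0.1], [2.1]], sigma=0.5)
```

In a hybrid workflow of the kind the abstract describes, a score like this would then enter an LR or Cox model as a covariate in place of, or alongside, the raw CVs.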

RESULTS

The LR model built from raw CVs with the best predictive capability gave Az = 0.703. While controlling for gender and tumor grade, the OR = 0.63 (CI: 0.43, 0.91) per standard deviation (SD) increase in age indicates that increasing age confers an unfavorable outcome. The hybrid LR model gave Az = 0.778 by combining age and tumor grade with the PNN and controlling for gender. The PNN score and age relate to risk in opposite directions. The OR = 0.27 (CI: 0.14, 0.53) per SD increase in PNN score indicates that patients with a lower score tend to have an unfavorable outcome. The tumor-grade-adjusted hazard for patients above the median age compared with those below the median was HR = 1.78 (CI: 1.06, 3.02), whereas the hazard for patients below the median PNN score compared with those above the median was HR = 4.0 (CI: 2.13, 7.14).
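Because the ratios above are reported against different reference groups, it can help to flip a ratio so that two estimates point the same way. The sketch below shows only the arithmetic (take reciprocals and swap the confidence bounds) applied to the published figures. It is not a reanalysis of the data, and the helper name `invert_ratio` is an assumption introduced for illustration.

```python
def invert_ratio(estimate, ci_low, ci_high):
    """Re-express an OR or HR with the comparison reversed.

    Reciprocals are taken and the CI bounds swap places, because
    1/x is a decreasing function for positive x.
    """
    return 1.0 / estimate, 1.0 / ci_high, 1.0 / ci_low

# HR for below-median vs above-median PNN score, as reported in the abstract.
hr_above_vs_below = invert_ratio(4.0, 2.13, 7.14)

# OR per SD increase in PNN score, re-expressed per SD decrease.
or_per_sd_decrease = invert_ratio(0.27, 0.14, 0.53)

print([round(v, 2) for v in hr_above_vs_below])
print([round(v, 2) for v in or_per_sd_decrease])
```

Read this way, HR = 4.0 (CI: 2.13, 7.14) for below-median versus above-median score corresponds to roughly 0.25 (CI: 0.14, 0.47) for above-median versus below-median score.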

CONCLUSION

We have provided preliminary evidence that SL preprocessing may offer benefits over accepted approaches. Further evaluation on a variety of datasets is required to confirm these findings.
