Biomarker Panel Development Using Logic Regression in the Presence of Missing Data.

Authors

Huang Ying, Dasgupta Sayan

Affiliation

Vaccine & Infectious Disease Division, Fred Hutchinson Cancer Center, US.

Publication information

N Engl J Stat Data Sci. 2024 Apr;2(1):3-14. doi: 10.51387/24-nejsds59. Epub 2024 Jan 31.

Abstract

We consider the problem of developing flexible and parsimonious biomarker combinations for cancer early detection in the presence of variable missingness at random. Motivated by the need to develop biomarker panels in a cross-institute pancreatic cyst biomarker validation study, we propose logic-regression based methods for feature selection and construction of logic rules under a multiple imputation framework. We generate ensemble trees for classification decision, and further select a single decision tree for simplicity and interpretability. We demonstrate superior performance of the proposed methods compared to alternative methods based on complete-case data or single imputation. The methods are applied to the pancreatic cyst data to estimate biomarker panels for pancreatic cysts subtype classification and malignant potential prediction.
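The workflow the abstract outlines (impute several times, fit a logic rule on each completed dataset, classify by ensemble vote, then keep one rule for interpretability) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the imputation is a naive draw from each marker's observed values, the rule search is an exhaustive scan over two-marker AND/OR rules rather than a logic regression search, and the data and function names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def impute_once(rows, rng):
    """Fill each missing (None) entry with a value sampled from that column's observed values."""
    n_cols = len(rows[0])
    observed = [[r[j] for r in rows if r[j] is not None] for j in range(n_cols)]
    return [[v if v is not None else rng.choice(observed[j]) for j, v in enumerate(r)]
            for r in rows]


def apply_rule(rule, x):
    """Evaluate a two-marker logic rule (i, j, op) on one binary marker vector."""
    i, j, op = rule
    return int(x[i] and x[j]) if op == "and" else int(x[i] or x[j])


def fit_rule(rows, labels):
    """Stand-in for a logic regression fit: pick the rule with the fewest misclassifications."""
    candidates = [(i, j, op)
                  for i, j in combinations(range(len(rows[0])), 2)
                  for op in ("and", "or")]

    def errors(rule):
        return sum(apply_rule(rule, x) != y for x, y in zip(rows, labels))

    return min(candidates, key=errors)


def fit_ensemble(rows, labels, n_imputations=10, seed=0):
    """Fit one rule per imputed copy of the data."""
    rng = random.Random(seed)
    return [fit_rule(impute_once(rows, rng), labels) for _ in range(n_imputations)]


def predict_vote(rules, x):
    """Ensemble decision: strict majority vote across the per-imputation rules."""
    votes = sum(apply_rule(r, x) for r in rules)
    return int(2 * votes > len(rules))


def most_common_rule(rules):
    """Single interpretable rule: the one selected most often across imputations."""
    return Counter(rules).most_common(1)[0][0]


# Hypothetical toy data: three binary markers, None marks a missing value.
# The label equals (marker 0 AND marker 1); marker 2 is noise.
rows = [
    [1, 1, 0], [1, 1, None], [1, 0, 1], [0, 1, 1],
    [0, 0, 0], [None, 0, 1], [1, 1, 1], [0, 1, None],
]
labels = [1, 1, 0, 0, 0, 0, 1, 0]

rules = fit_ensemble(rows, labels)
print(most_common_rule(rules))        # the single rule kept for interpretability
print(predict_vote(rules, [1, 1, 0])) # ensemble prediction for a new sample
```

In this toy setting the missing entries never affect the true rule, so every imputation selects the same rule. With real data the selected rules would typically vary across imputations, which is what makes both the vote and the choice of a single representative rule meaningful.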
