An optimal subsampling design for large-scale Cox model with censored data.

Authors

Liu Shiqi, Xie Zilong, Zheng Ming, Yu Wen

Affiliations

Department of Statistics and Data Science, School of Management, Fudan University, Shanghai, People's Republic of China.

School of Mathematical Sciences, Fudan University, Shanghai, People's Republic of China.

Publication information

J Appl Stat. 2024 Nov 4;52(7):1315-1341. doi: 10.1080/02664763.2024.2423234. eCollection 2025.

DOI:10.1080/02664763.2024.2423234
PMID:40453364
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12123965/
Abstract

Subsampling designs are useful for reducing the computational load and storage cost of large-scale data analysis. For massive survival data with right censoring, we propose a class of optimal subsampling designs under the widely used Cox model. The proposed designs utilize information from both the outcome and the covariates. Different forms of the design can be derived adaptively to meet various targets, such as optimizing the overall estimation accuracy or minimizing the variation of a specific linear combination of the estimators. Given the subsampled data, the inverse probability weighting approach is employed to estimate the model parameters. The resultant estimators are shown to be consistent and asymptotically normally distributed. Simulation results indicate that the proposed subsampling design yields more efficient estimators than uniform subsampling when the subsamples are of comparable size. Additionally, subsampling estimation significantly reduces the computational load and storage cost relative to full-data estimation. An analysis of a real data example is provided for illustration.
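
To make the inverse probability weighting step concrete, here is a minimal sketch in Python with NumPy. It is not the authors' optimal design: the inclusion probabilities `pi` (0.20 for observed events, 0.05 for censored subjects) are a hypothetical outcome-dependent choice made only for illustration, and the simulated data, the function name `fit_weighted_cox`, and all parameter values are assumptions. The sketch draws a Poisson subsample, weights each retained subject by `1 / pi`, and maximizes the weighted Cox partial likelihood by Newton-Raphson, assuming no tied event times.

```python
import numpy as np

def fit_weighted_cox(time, event, X, w, n_iter=50, tol=1e-8):
    """Newton-Raphson for the inverse-probability-weighted Cox partial likelihood.

    Assumes no tied event times (true for the continuous simulated data below).
    """
    order = np.argsort(-time)                 # latest time first
    event, X, w = event[order], X[order], w[order]
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        r = w * np.exp(X @ beta)              # weighted relative risks
        # Cumulative sums in descending-time order are sums over each risk set.
        s0 = np.cumsum(r)
        s1 = np.cumsum(r[:, None] * X, axis=0)
        s2 = np.cumsum(r[:, None, None] * X[:, :, None] * X[:, None, :], axis=0)
        xbar = s1 / s0[:, None]               # weighted covariate mean per risk set
        d = w * event                         # only observed events contribute
        grad = (d[:, None] * (X - xbar)).sum(axis=0)
        info = (d[:, None, None]
                * (s2 / s0[:, None, None]
                   - xbar[:, :, None] * xbar[:, None, :])).sum(axis=0)
        step = np.linalg.solve(info, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated right-censored data (all values here are illustrative assumptions).
rng = np.random.default_rng(0)
n, beta_true = 20_000, np.array([0.5, -0.5])
X = rng.normal(size=(n, 2))
t_event = rng.exponential(1.0 / np.exp(X @ beta_true))   # hazard exp(x'beta)
t_cens = rng.exponential(2.0, size=n)                     # independent censoring
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens

# Hypothetical outcome-dependent inclusion probabilities (NOT the paper's design).
pi = np.where(event, 0.20, 0.05)
keep = rng.random(n) < pi                                 # Poisson subsampling

beta_sub = fit_weighted_cox(time[keep], event[keep], X[keep], 1.0 / pi[keep])
beta_full = fit_weighted_cox(time, event, X, np.ones(n))
print("subsample size:", keep.sum())
print("IPW subsample estimate:", beta_sub)
print("full-data estimate:   ", beta_full)
```

Weighting by `1 / pi` keeps the subsample estimating equation unbiased for the full-data one, which is why the two estimates should land close to each other. The paper's contribution is the choice of `pi` that makes the subsample estimator as efficient as possible, and this sketch does not attempt that.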
