
Greedy knot selection algorithm for restricted cubic spline regression.

Authors

Arnes Jo Inge, Hapfelmeier Alexander, Horsch Alexander, Braaten Tonje

Affiliations

Department of Computer Science, Faculty of Science and Technology, UiT The Arctic University of Norway, Tromsø, Norway.

Institute of AI and Informatics in Medicine, TUM School of Medicine, Technical University of Munich, Munich, Germany.

Publication information

Front Epidemiol. 2023 Dec 18;3:1283705. doi: 10.3389/fepid.2023.1283705. eCollection 2023.

DOI:10.3389/fepid.2023.1283705
PMID:38455941
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10910934/
Abstract

Non-linear regression modeling is common in epidemiology, both for prediction and for estimating relationships between predictor and response variables. Restricted cubic spline (RCS) regression is one such method and is highly relevant to, for example, Cox proportional hazards regression analysis. RCS regression uses third-order polynomials joined at knot points to model non-linear relationships. The standard approach places knots at a regular sequence of quantiles between the outer boundaries. With a relatively high number of knots, a regression curve can easily be fitted to the sample. The problem is then overfitting: the regression model fits the given sample well but does not generalize well to other samples. A low knot count is thus preferred. However, the standard knot selection process can lead to underperformance in the sparser regions of the predictor variable, especially with a low number of knots, and to overfitting in the denser regions. We present a simple greedy search algorithm using a backward method for knot selection that, in simulation experiments, shows reduced prediction error and lower Bayesian information criterion scores compared with the standard knot selection process. We have implemented the algorithm as part of an open-source R package, knutar.
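
To make the idea in the abstract concrete, here is a minimal Python sketch of backward greedy knot elimination for an RCS model. It is not the knutar implementation, and the abstract does not give the algorithm's details, so several things are assumptions: the function names (`rcs_basis`, `bic`, `quantile_knots`, `greedy_backward_knots`) are hypothetical, the outer boundaries are taken as the 5th and 95th percentiles, the model is ordinary least squares with a Gaussian BIC, and any knot (including a boundary knot) may be dropped at each step.

```python
import numpy as np


def rcs_basis(x, knots):
    """Design matrix [1, x, k-2 nonlinear terms] of a restricted cubic spline.

    Uses the truncated power form, which is linear beyond the outer knots.
    """
    x = np.asarray(x, dtype=float)
    t = np.sort(np.asarray(knots, dtype=float))
    k = len(t)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")

    def pos3(u):
        # (u)_+^3: cube of the positive part
        return np.clip(u, 0.0, None) ** 3

    gap = t[k - 1] - t[k - 2]
    scale = (t[k - 1] - t[0]) ** 2  # keeps columns on a scale comparable to x
    cols = [np.ones_like(x), x]
    for j in range(k - 2):
        term = (
            pos3(x - t[j])
            - pos3(x - t[k - 2]) * (t[k - 1] - t[j]) / gap
            + pos3(x - t[k - 1]) * (t[k - 2] - t[j]) / gap
        )
        cols.append(term / scale)
    return np.column_stack(cols)


def bic(x, y, knots):
    """Gaussian BIC (up to a constant) of a least-squares RCS fit."""
    X = rcs_basis(x, knots)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    n = len(y)
    return n * np.log(rss / n) + X.shape[1] * np.log(n)


def quantile_knots(x, k, lower=0.05, upper=0.95):
    """Standard placement: a regular sequence of quantiles.

    The 5%/95% outer boundaries are an assumption of this sketch.
    """
    return np.quantile(x, np.linspace(lower, upper, k))


def greedy_backward_knots(x, y, n_start=10, n_target=4):
    """Start with many quantile knots, then repeatedly drop the single knot
    whose removal yields the lowest BIC, until n_target knots remain."""
    knots = list(quantile_knots(x, n_start))
    while len(knots) > n_target:
        candidates = [knots[:i] + knots[i + 1:] for i in range(len(knots))]
        scores = [bic(x, y, c) for c in candidates]
        knots = candidates[int(np.argmin(scores))]
    return np.array(knots)


if __name__ == "__main__":
    # Simulated, right-skewed predictor so that some regions are sparse.
    rng = np.random.default_rng(0)
    x = rng.exponential(scale=1.0, size=300)
    y = np.sin(2 * x) + rng.normal(scale=0.3, size=x.size)

    standard = quantile_knots(x, 4)
    greedy = greedy_backward_knots(x, y, n_start=10, n_target=4)
    print("standard knots:", standard, "BIC:", bic(x, y, standard))
    print("greedy knots:  ", greedy, "BIC:", bic(x, y, greedy))
```

The search is greedy because each step keeps only the best single removal and never revisits an earlier choice, so it evaluates on the order of `n_start` squared models rather than every subset of knots.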


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/cf6a37878342/fepid-03-1283705-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/04b5bda952cf/fepid-03-1283705-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/60fd0506d744/fepid-03-1283705-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/016e1881282f/fepid-03-1283705-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/cbfd8e61a1ab/fepid-03-1283705-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/cc90e1646a23/fepid-03-1283705-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/111ba4493ad6/fepid-03-1283705-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/e361599d57a1/fepid-03-1283705-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/0e5799d700a7/fepid-03-1283705-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/1ce2037d3674/fepid-03-1283705-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/5799bb742017/fepid-03-1283705-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/8ad94a9165eb/fepid-03-1283705-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ac7/10910934/b89158009047/fepid-03-1283705-g013.jpg


