Clustering of trend data using joinpoint regression models.

Authors

Kim Hyune-Ju, Luo Jun, Kim Jeankyung, Chen Huann-Sheng, Feuer Eric J

Affiliation

Department of Mathematics, Syracuse University, Syracuse, NY, 13244, U.S.A.

Publication information

Stat Med. 2014 Oct 15;33(23):4087-103. doi: 10.1002/sim.6221. Epub 2014 Jun 3.

DOI: 10.1002/sim.6221

PMID: 24895073

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4159412/

Abstract

In this paper, we propose methods to cluster groups of two-dimensional data whose mean functions are piecewise linear into several clusters with common characteristics such as the same slopes. To fit segmented line regression models with common features for each possible cluster, we use a restricted least squares method. In implementing the restricted least squares method, we estimate the maximum number of segments in each cluster by using both the permutation test method and the Bayes information criterion method and then propose to use the Bayes information criterion to determine the number of clusters. For a more effective implementation of the clustering algorithm, we propose a measure of the minimum distance worth detecting and illustrate its use in two examples. We summarize simulation results to study properties of the proposed methods and also prove the consistency of the cluster grouping estimated with a given number of clusters. The presentation and examples in this paper focus on the segmented line regression model with the ordered values of the independent variable, which has been the model of interest in cancer trend analysis, but the proposed method can be applied to a general model with design points either ordered or unordered.
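
The core idea of the abstract (fit each candidate cluster by restricted least squares with shared slopes, then compare groupings with an information criterion) can be illustrated with a small sketch. Everything below is an assumption made for illustration and not the authors' implementation: the data are synthetic, every cluster is given exactly one joinpoint located by grid search, only the intercept differs between groups within a cluster, the BIC uses a generic Gaussian form, and the function names `fit_one_joinpoint` and `partition_bic` are hypothetical.

```python
import numpy as np

def fit_one_joinpoint(x, ys):
    """Restricted least squares for one cluster (illustrative assumption):
    each group keeps its own intercept, while the slope, the slope change,
    and the joinpoint location tau are shared by all groups in the cluster.
    ys has shape (groups, len(x)). Returns (rss, number_of_parameters)."""
    g, n = ys.shape
    y = ys.ravel()  # group 0's values first, then group 1's, and so on
    intercepts = np.kron(np.eye(g), np.ones((n, 1)))  # one column per group
    best_rss = np.inf
    for tau in x[2:-2]:  # grid search over interior design points
        hinge = np.maximum(x - tau, 0.0)  # (x - tau)_+ models the slope change
        X = np.column_stack([intercepts, np.tile(x, g), np.tile(hinge, g)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ beta) ** 2))
        best_rss = min(best_rss, rss)
    # g intercepts + shared slope + shared slope change + joinpoint location
    return best_rss, g + 3

def partition_bic(x, ys, partition):
    """Generic Gaussian BIC of a grouping: N*log(RSS/N) + p*log(N).
    This form is an assumption, not necessarily the paper's criterion."""
    rss, p = 0.0, 0
    for members in partition:
        r, k = fit_one_joinpoint(x, ys[list(members)])
        rss += r
        p += k
    N = ys.size
    return N * np.log(rss / N) + p * np.log(N)

# Synthetic example: groups 0 and 1 share slopes, group 2 follows a different trend.
rng = np.random.default_rng(0)
x = np.arange(30.0)
hinge_true = np.maximum(x - 15.0, 0.0)
ys = np.vstack([
    1.0 + 0.5 * x - 0.8 * hinge_true,
    3.0 + 0.5 * x - 0.8 * hinge_true,
    2.0 - 0.3 * x + 0.6 * hinge_true,
]) + rng.normal(scale=0.3, size=(3, x.size))

candidates = {
    "one cluster": [(0, 1, 2)],
    "two clusters": [(0, 1), (2,)],
}
scores = {name: partition_bic(x, ys, part) for name, part in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "-> selected:", best)
```

In this sketch a lower BIC indicates the preferred grouping. Forcing all three synthetic groups to share slopes leaves a large residual sum of squares, so the two-cluster grouping is selected. The paper additionally estimates the number of segments per cluster (by permutation test or BIC) rather than fixing it at one joinpoint as done here.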
