Application of the joint clustering algorithm based on Gaussian kernels and differential privacy in lung cancer identification.

Authors

Yanping Hang, Haixia Zheng, Minmin Yang, Nan Wang, Miaomiao Kong, Mingming Zhao

Affiliation

Department of Respiratory and Critical Care Medicine, Affiliated Nanjing Gaochun People's Hospital, Jiangsu University, Nanjing, 210000, Jiangsu, China.

Publication

Sci Rep. 2025 May 16;15(1):17094. doi: 10.1038/s41598-025-01873-8.

DOI: 10.1038/s41598-025-01873-8
PMID: 40379735
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12084312/
Abstract

In the age of big data, privacy, particularly medical data privacy, is becoming increasingly important. Differential privacy (DP) has emerged as a key method for safeguarding privacy during data analysis and publishing. Cancer identification and classification play a vital role in early detection and treatment. This paper introduces a novel algorithm, DPFCM_GK, which combines differential privacy with fuzzy c-means (FCM) clustering using a Gaussian kernel function. The algorithm enhances cancer detection while ensuring data privacy. Three publicly available lung cancer datasets, along with a dataset from our hospital, are used to test and demonstrate the effectiveness of DPFCM_GK. The experimental results show that DPFCM_GK achieves high clustering accuracy and enhanced privacy as the privacy budget (ε) increases. For the UCIML, NLST, and NSCLC datasets, it reaches optimal results at lower ε (1.52, 1.24, and 2.32, respectively) than DPFCM. In the lung cancer dataset, DPFCM_GK outperforms DPFCM within 0.05 ≤ ε ≤ 2.5, with significant differences (χ² = 4.54 to 29.12; P < 0.05), and both methods converge to an accuracy of 94.5% as ε increases. Although differential privacy initially increases iteration counts, DPFCM_GK converges faster and needs fewer iterations than DPFCM, with significant reductions (T = 23.08, 43.47, and 48.93; P < 0.05). For the UCIML dataset, DPFCM_GK significantly reduces runtime compared to other models (DPFCM, LDP-SGD, LDP-Fed, LDP-FedSGD, MGM-DPL, LDP-FL) under the same privacy budget (T = 21.08, 316.24, 102.35, 222.37, 162.23, 159.25; P < 0.05). DPFCM_GK still maintains excellent time efficiency when applied to the NLST and NSCLC datasets (P < 0.05). For the LLCS dataset, DPFCM_GK demonstrates significant improvement as the privacy budget increases, especially in low-budget scenarios, where the performance gap is most pronounced (T = 4.20, 8.44, 10.92, 3.95, 7.16, 8.51; P < 0.05). These results confirm DPFCM_GK as a practical solution for medical data analysis, balancing accuracy, privacy, and efficiency.
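
The abstract names the ingredients of DPFCM_GK (fuzzy c-means, a Gaussian kernel, differential privacy) but not how they are wired together. The sketch below is one plausible reading, not the authors' implementation. It uses the standard kernel FCM updates and, as assumptions of this example, perturbs each center coordinate with Laplace noise, splits ε evenly across iterations, takes the `sensitivity` value from the caller, and expects features scaled to [0, 1]. All function and parameter names are hypothetical, and the sketch does not by itself establish a formal ε-DP guarantee.

```python
import math
import random


def gaussian_kernel(x, v, sigma):
    """K(x, v) = exp(-||x - v||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def memberships(points, centers, m, sigma):
    """Fuzzy memberships from the kernel-induced distance.

    In feature space the squared distance is 2 * (1 - K(x, v)); the factor 2
    cancels in the ratio, so only 1 - K is needed. Each row sums to 1.
    """
    power = -1.0 / (m - 1.0)
    rows = []
    for x in points:
        # Floor the distance so a point sitting on a center does not divide by zero.
        inv = [max(1.0 - gaussian_kernel(x, v, sigma), 1e-12) ** power
               for v in centers]
        total = sum(inv)
        rows.append([value / total for value in inv])
    return rows


def laplace_noise(rng, scale):
    """Laplace(0, scale) sample, drawn as the difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def dp_kernel_fcm(points, c, epsilon, iters=20, m=2.0, sigma=0.5,
                  sensitivity=1.0, init=None, seed=0):
    """Kernel FCM with noisy center updates (illustrative assumptions only)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    start = init if init is not None else rng.sample(points, c)
    centers = [list(v) for v in start]
    # Assumption: even budget split, so each iteration spends epsilon / iters.
    scale = sensitivity / (epsilon / iters)
    for _ in range(iters):
        u = memberships(points, centers, m, sigma)
        updated = []
        for j, v in enumerate(centers):
            # Kernel FCM weight of point i for center j: u_ij^m * K(x_i, v_j).
            w = [(u[i][j] ** m) * gaussian_kernel(points[i], v, sigma)
                 for i in range(n)]
            total = max(sum(w), 1e-12)
            noisy = []
            for d in range(dim):
                mean_d = sum(w_i * p[d] for w_i, p in zip(w, points)) / total
                # Add noise, then clip back into the assumed [0, 1] feature range.
                noisy.append(min(1.0, max(0.0, mean_d + laplace_noise(rng, scale))))
            updated.append(noisy)
        centers = updated
    return centers, memberships(points, centers, m, sigma)


# Toy usage on two made-up 2-D groups (not data from the paper).
toy = [(0.10, 0.10), (0.12, 0.08), (0.08, 0.12), (0.11, 0.09),
       (0.90, 0.90), (0.88, 0.92), (0.92, 0.88), (0.91, 0.89)]
toy_centers, toy_u = dp_kernel_fcm(toy, c=2, epsilon=2.0, init=[toy[0], toy[4]])
print(toy_centers)
```

With this calibration a smaller `epsilon` gives a larger noise scale, so the centers drift further from their noise-free positions. That is the accuracy-versus-budget trade-off the abstract evaluates across ε values.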

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0af9/12084312/ec7796201209/41598_2025_1873_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0af9/12084312/8d943382dc1a/41598_2025_1873_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0af9/12084312/484abdc6272f/41598_2025_1873_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0af9/12084312/23f2677d3c86/41598_2025_1873_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0af9/12084312/bfa1bc1a47b6/41598_2025_1873_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0af9/12084312/90764c67d66e/41598_2025_1873_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0af9/12084312/a5c434e521b8/41598_2025_1873_Fig7_HTML.jpg

Similar articles

1
Application of the joint clustering algorithm based on Gaussian kernels and differential privacy in lung cancer identification.
Sci Rep. 2025 May 16;15(1):17094. doi: 10.1038/s41598-025-01873-8.
2
Differential privacy fuzzy C-means clustering algorithm based on Gaussian kernel function.
PLoS One. 2021 Mar 23;16(3):e0248737. doi: 10.1371/journal.pone.0248737. eCollection 2021.
3
A(DP)²SGD: Asynchronous Decentralized Parallel Stochastic Gradient Descent With Differential Privacy.
IEEE Trans Pattern Anal Mach Intell. 2022 Nov;44(11):8036-8047. doi: 10.1109/TPAMI.2021.3107796. Epub 2022 Oct 4.
4
A differential privacy protecting K-means clustering algorithm based on contour coefficients.
PLoS One. 2018 Nov 21;13(11):e0206832. doi: 10.1371/journal.pone.0206832. eCollection 2018.
5
FAItH: Federated Analytics and Integrated Differential Privacy with Clustering for Healthcare Monitoring.
Sci Rep. 2025 Mar 24;15(1):10155. doi: 10.1038/s41598-025-94501-4.
6
Sparsified federated learning with differential privacy for intrusion detection in VANETs based on Fisher Information Matrix.
PLoS One. 2024 Apr 17;19(4):e0301897. doi: 10.1371/journal.pone.0301897. eCollection 2024.
7
Personal health data protection and intelligent healthcare applications under generative adversarial network.
Sci Rep. 2025 May 13;15(1):16558. doi: 10.1038/s41598-025-01575-1.
8
PPPCT: Privacy-Preserving framework for Parallel Clustering Transcriptomics data.
Comput Biol Med. 2024 May;173:108351. doi: 10.1016/j.compbiomed.2024.108351. Epub 2024 Mar 21.
9
Detecting clusters of different geometrical shapes in microarray gene expression data.
Bioinformatics. 2005 May 1;21(9):1927-34. doi: 10.1093/bioinformatics/bti251. Epub 2005 Jan 12.
10
An improved fuzzy C-means clustering algorithm for assisted therapy of chronic bronchitis.
Technol Health Care. 2015;23(6):699-713. doi: 10.3233/THC-151023.
