*K-means and cluster models for cancer signatures.

Author information

Kakushadze Zura, Yu Willie

Affiliations

Quantigic® Solutions LLC, 1127 High Ridge Road #135, Stamford, CT 06905, United States.

Free University of Tbilisi, Business School & School of Physics, 240, David Agmashenebeli Alley, Tbilisi 0159, Georgia.

Publication information

Biomol Detect Quantif. 2017 Aug 2;13:7-31. doi: 10.1016/j.bdq.2017.07.001. eCollection 2017 Sep.

DOI: 10.1016/j.bdq.2017.07.001
PMID: 29021969
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5634820/

Abstract

We present the *K-means clustering algorithm and its source code, expanding the statistical clustering methods applied to quantitative finance in https://ssrn.com/abstract=2802753. *K-means is statistically deterministic without specifying initial centers, etc. We apply *K-means to extract cancer signatures from genome data without using nonnegative matrix factorization (NMF). The computational cost of *K-means is a fraction of that of NMF. Using 1389 published samples for 14 cancer types, we find that 3 cancers (liver cancer, lung cancer and renal cell carcinoma) stand out and do not have cluster-like structures. Two clusters have especially high within-cluster correlations with 11 other cancers, indicating common underlying structures. Our approach opens a novel avenue for studying such structures. *K-means is universal and can be applied in other fields. We discuss some potential applications in quantitative finance.
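
The abstract does not describe how *K-means reaches a result that is stable without chosen initial centers, so the following is only an illustrative sketch and not the authors' algorithm or source code. It assumes that the idea can be approximated by running plain k-means from many random starts and keeping the partition that occurs most often. The function names (`kmeans_once`, `canonical`, `most_frequent_clustering`), the toy data in `POINTS`, and the parameters `runs` and `seed` are all hypothetical; the fixed seed is there only to make the demonstration reproducible.

```python
import random
from collections import Counter

# Toy data (an assumption for illustration): three well-separated 2-D groups.
POINTS = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.0),
    (5.0, 5.0), (5.1, 5.2), (5.2, 5.0),
    (10.0, 0.0), (10.1, 0.2), (10.2, 0.0),
]


def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centers[j])),
    )


def kmeans_once(points, k, rng, max_iter=100):
    """One run of plain (Lloyd's) k-means started from k randomly chosen data points."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centers) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def canonical(labels):
    """Relabel clusters by order of first appearance so equal partitions compare equal."""
    mapping = {}
    return tuple(mapping.setdefault(lab, len(mapping)) for lab in labels)


def most_frequent_clustering(points, k, runs=200, seed=0):
    """Run k-means `runs` times; return the modal partition and its share of runs."""
    rng = random.Random(seed)
    counts = Counter(canonical(kmeans_once(points, k, rng)) for _ in range(runs))
    labels, hits = counts.most_common(1)[0]
    return list(labels), hits / runs


if __name__ == "__main__":
    labels, share = most_frequent_clustering(POINTS, k=3)
    print(labels, share)
```

A single k-means run can stop in a local minimum that depends on its starting centers, whereas the most frequent partition across many runs changes far less from one batch of runs to the next. That is the sense of stability this sketch tries to convey; the published method may aggregate runs differently.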


Similar articles

1
*K-means and cluster models for cancer signatures.
Biomol Detect Quantif. 2017 Aug 2;13:7-31. doi: 10.1016/j.bdq.2017.07.001. eCollection 2017 Sep.
2
Mutation Clusters from Cancer Exome.
Genes (Basel). 2017 Aug 15;8(8):201. doi: 10.3390/genes8080201.
3
Reducing microarray data via nonnegative matrix factorization for visualization and clustering analysis.
J Biomed Inform. 2008 Aug;41(4):602-6. doi: 10.1016/j.jbi.2007.12.003. Epub 2007 Dec 23.
4
Generalized Separable Nonnegative Matrix Factorization.
IEEE Trans Pattern Anal Mach Intell. 2021 May;43(5):1546-1561. doi: 10.1109/TPAMI.2019.2956046. Epub 2021 Apr 1.
5
Manifold Peaks Nonnegative Matrix Factorization.
IEEE Trans Neural Netw Learn Syst. 2024 May;35(5):6850-6862. doi: 10.1109/TNNLS.2022.3212922. Epub 2024 May 2.
6
Convex nonnegative matrix factorization with manifold regularization.
Neural Netw. 2015 Mar;63:94-103. doi: 10.1016/j.neunet.2014.11.007. Epub 2014 Dec 4.
7
Does Determination of Initial Cluster Centroids Improve the Performance of K-Means Clustering Algorithm? Comparison of Three Hybrid Methods by Genetic Algorithm, Minimum Spanning Tree, and Hierarchical Clustering in an Applied Study.
Comput Math Methods Med. 2020 Aug 1;2020:7636857. doi: 10.1155/2020/7636857. eCollection 2020.
8
Hessian regularization based symmetric nonnegative matrix factorization for clustering gene expression and microbiome data.
Methods. 2016 Dec 1;111:80-84. doi: 10.1016/j.ymeth.2016.06.017. Epub 2016 Jun 20.
9
Hybrid Clustering of Single-Cell Gene Expression and Spatial Information Integrated NMF and K-Means.
Front Genet. 2021 Nov 8;12:763263. doi: 10.3389/fgene.2021.763263. eCollection 2021.
10
Clinical Documents Clustering Based on Medication/Symptom Names Using Multi-View Nonnegative Matrix Factorization.
IEEE Trans Nanobioscience. 2015 Jul;14(5):500-4. doi: 10.1109/TNB.2015.2422612. Epub 2015 May 21.

Cited by

1
Reliable epithelial-mesenchymal transition biomarkers for colorectal cancer detection.
Biomark Med. 2022 Aug;16(12):889-901. doi: 10.2217/bmm-2022-0071. Epub 2022 Jul 27.
2
lncRNA Profiles Enable Prognosis Prediction and Subtyping for Esophageal Squamous Cell Carcinoma.
Front Cell Dev Biol. 2021 May 28;9:656554. doi: 10.3389/fcell.2021.656554. eCollection 2021.
3
A tumor microenvironment-specific gene expression signature predicts chemotherapy resistance in colorectal cancer patients.
NPJ Precis Oncol. 2021 Feb 12;5(1):7. doi: 10.1038/s41698-021-00142-x.
4
Machine Learning in Oncology: Methods, Applications, and Challenges.
JCO Clin Cancer Inform. 2020 Oct;4:885-894. doi: 10.1200/CCI.20.00072.
5
Dimension Reduction and Clustering Models for Single-Cell RNA Sequencing Data: A Comparative Study.
Int J Mol Sci. 2020 Mar 22;21(6):2181. doi: 10.3390/ijms21062181.
6
Mutation Clusters from Cancer Exome.
Genes (Basel). 2017 Aug 15;8(8):201. doi: 10.3390/genes8080201.

References

1
Whole-genome mutational landscape and characterization of noncoding and structural mutations in liver cancer.
Nat Genet. 2016 May;48(5):500-9. doi: 10.1038/ng.3547. Epub 2016 Apr 11.
2
Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma.
Am J Hum Genet. 2016 Feb 4;98(2):256-74. doi: 10.1016/j.ajhg.2015.12.013. Epub 2016 Jan 28.
3
Clusters of Multiple Mutations: Incidence and Molecular Mechanisms.
Annu Rev Genet. 2015;49:243-67. doi: 10.1146/annurev-genet-112414-054714.
4
Non-coding recurrent mutations in chronic lymphocytic leukaemia.
Nature. 2015 Oct 22;526(7574):519-24. doi: 10.1038/nature14666. Epub 2015 Jul 22.
5
Whole-genome characterization of chemoresistant ovarian cancer.
Nature. 2015 May 28;521(7553):489-94. doi: 10.1038/nature14410.
6
The evolutionary history of lethal metastatic prostate cancer.
Nature. 2015 Apr 16;520(7547):353-357. doi: 10.1038/nature14347. Epub 2015 Apr 1.
7
Whole genomes redefine the mutational landscape of pancreatic cancer.
Nature. 2015 Feb 26;518(7540):495-501. doi: 10.1038/nature14169.
8
Hypermutation in human cancer genomes: footprints and mechanisms.
Nat Rev Cancer. 2014 Dec;14(12):786-800. doi: 10.1038/nrc3816.
9
AID expression in B-cell lymphomas causes accumulation of genomic uracil and a distinct AID mutational signature.
DNA Repair (Amst). 2015 Jan;25:60-71. doi: 10.1016/j.dnarep.2014.11.006. Epub 2014 Nov 24.
10
B cell super-enhancers and regulatory clusters recruit AID tumorigenic activity.
Cell. 2014 Dec 18;159(7):1524-37. doi: 10.1016/j.cell.2014.11.013. Epub 2014 Dec 4.