An Accurate Tool for Uncovering Cancer Subtypes by Fast Kernel Learning Method to Integrate Multiple Profile Data.

Authors

Zhang Hongyu, Jiang Limin, Tang Jijun, Ding Yijie

Affiliations

School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China.

Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, United States.

Publication information

Front Cell Dev Biol. 2021 Mar 5;9:615747. doi: 10.3389/fcell.2021.615747. eCollection 2021.

DOI:10.3389/fcell.2021.615747
PMID:33763416
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7982914/

Abstract

In recent years, cancer has become a severe threat to human health. Accurately identifying cancer subtypes is of great significance for anti-cancer drug research, for the development of personalized treatments, and ultimately for conquering cancer. In this paper, we obtain three feature representation datasets (gene expression profiles, isoform expression, and DNA methylation data) for lung cancer and renal cancer from the Broad GDAC, which collects standardized data extracted from The Cancer Genome Atlas (TCGA). Because the feature dimension is very large, we use Principal Component Analysis (PCA) to reduce the dimensionality of the feature vectors, eliminating redundant features and speeding up the classification model. Within a multiple kernel learning (MKL) framework, we compute the kernel fusion weights with four strategies: kernel target alignment (KTA), fast kernel learning (FKL), the Hilbert-Schmidt Independence Criterion (HSIC), and a simple mean. Finally, we feed the combined kernel into a support vector machine (SVM) and obtain excellent results. In the classification of renal cell carcinoma subtypes, MKL with HSIC-computed weights reaches a maximum accuracy of 0.978; in the classification of lung cancer subtypes, MKL with FKL-computed weights reaches an accuracy of 0.990.
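
The kernel fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only two of the four weighting strategies (a KTA-style heuristic and the simple mean), it skips the PCA step, and the toy data, the RBF kernel choice, the `gamma` value, the +1/-1 ideal kernel, and all function names are assumptions made for illustration. Each of the three hypothetical "views" stands in for one data type (for example gene expression, isoform expression, and DNA methylation).

```python
import math

def rbf_kernel(X, gamma):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]

def ideal_kernel(labels):
    """Assumed target kernel: +1 for same-class pairs, -1 otherwise."""
    return [[1.0 if a == b else -1.0 for b in labels] for a in labels]

def frobenius_inner(A, B):
    """Frobenius inner product <A, B> = sum_ij A[i][j] * B[i][j]."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

def alignment(K, T):
    """Kernel target alignment: <K, T> / (||K||_F * ||T||_F)."""
    return frobenius_inner(K, T) / math.sqrt(
        frobenius_inner(K, K) * frobenius_inner(T, T))

def mean_weights(kernels):
    """Uniform weights: every kernel contributes equally."""
    return [1.0 / len(kernels)] * len(kernels)

def kta_weights(kernels, labels):
    """Heuristic: weight each kernel by its (non-negative) alignment score."""
    T = ideal_kernel(labels)
    scores = [max(alignment(K, T), 0.0) for K in kernels]
    total = sum(scores)
    if total == 0.0:
        return mean_weights(kernels)  # fall back if no kernel aligns
    return [s / total for s in scores]

def combine(kernels, weights):
    """Weighted sum of Gram matrices."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

# Hypothetical toy data: four samples, two subtypes, three one-dimensional views.
labels = [0, 0, 1, 1]
views = [
    [[0.0], [0.1], [1.0], [1.1]],  # separates the two subtypes well
    [[0.0], [1.0], [0.1], [1.1]],  # nearly uninformative
    [[0.0], [0.4], [0.6], [1.0]],  # partially informative
]
kernels = [rbf_kernel(X, gamma=1.0) for X in views]
weights = kta_weights(kernels, labels)
fused = combine(kernels, weights)
uniform = combine(kernels, mean_weights(kernels))
```

In this sketch the view whose kernel agrees best with the subtype labels receives the largest weight, while the mean strategy ignores the labels entirely. The fused Gram matrix could then be passed to an SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.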


Figures 1-5:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa60/7982914/ddbf4218d439/fcell-09-615747-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa60/7982914/62c8d403c8da/fcell-09-615747-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa60/7982914/489a38e1ab3c/fcell-09-615747-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa60/7982914/dfa95982b1ee/fcell-09-615747-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa60/7982914/2edc2aa896d6/fcell-09-615747-g005.jpg
