Multi-class clustering of cancer subtypes through SVM based ensemble of pareto-optimal solutions for gene marker identification.

Affiliation

Department of Computer Science and Engineering, University of Kalyani, Kalyani, West Bengal, India.

Publication information

PLoS One. 2010 Nov 12;5(11):e13803. doi: 10.1371/journal.pone.0013803.


DOI:10.1371/journal.pone.0013803
PMID:21103052
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2980474/
Abstract

With the advancement of microarray technology, it is now possible to study the expression profiles of thousands of genes across different experimental conditions or tissue samples simultaneously. Microarray cancer datasets, organized in a samples-versus-genes fashion, are being used for classification of tissue samples into benign and malignant or their subtypes. They are also useful for identifying potential gene markers for each cancer subtype, which helps in successful diagnosis of particular cancer types. In this article, we have presented an unsupervised cancer classification technique based on multiobjective genetic clustering of the tissue samples. In this regard, a real-coded encoding of the cluster centers is used, and cluster compactness and separation are simultaneously optimized. The resultant set of near-Pareto-optimal solutions contains a number of non-dominated solutions. A novel approach to combine the clustering information possessed by the non-dominated solutions through a Support Vector Machine (SVM) classifier has been proposed. Final clustering is obtained by consensus among the clusterings yielded by different kernel functions. The performance of the proposed multiobjective clustering method has been compared with that of several other microarray clustering algorithms on three publicly available benchmark cancer datasets. Moreover, statistical significance tests have been conducted to establish the statistical superiority of the proposed clustering method. Furthermore, relevant gene markers have been identified using the clustering result produced by the proposed clustering method and demonstrated visually. Biological relationships among the gene markers are also studied based on gene ontology. The results obtained are found to be promising and could have an important impact in the area of unsupervised cancer classification as well as gene marker identification for multiple cancer subtypes.
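The idea of keeping only non-dominated solutions when compactness and separation are optimized together can be illustrated with a small sketch. This is not the authors' implementation: the two objective definitions (sum of squared distances to the nearest center for compactness, minimum squared distance between centers for separation), the toy samples, and all function names are assumptions chosen for illustration. The genetic search and the SVM-based combination step are omitted.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def objectives(samples, centers):
    """Score one candidate, i.e. one real-coded set of cluster centers.

    Assumed objectives (illustrative only):
      compactness: sum of squared distances to the nearest center (lower is better)
      separation:  smallest squared distance between two centers (higher is better)
    """
    compactness = sum(min(sq_dist(s, c) for c in centers) for s in samples)
    separation = min(sq_dist(c1, c2) for c1, c2 in combinations(centers, 2))
    return compactness, separation


def dominates(f, g):
    """True if score f is at least as good as g on both objectives and
    strictly better on at least one."""
    no_worse = f[0] <= g[0] and f[1] >= g[1]
    strictly_better = f[0] < g[0] or f[1] > g[1]
    return no_worse and strictly_better


def non_dominated(scored):
    """Indices of candidates that no other candidate dominates."""
    return [
        i for i, f in enumerate(scored)
        if not any(dominates(g, f) for j, g in enumerate(scored) if j != i)
    ]


# Toy data: four samples (rows) measured on two hypothetical genes (columns).
samples = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Three hypothetical candidates, each encoding two cluster centers.
candidates = [
    [(0, 0.5), (4, 0.5)],    # tight clusters
    [(1, 0.5), (3, 0.5)],    # centers pulled toward each other
    [(-1, 0.5), (5, 0.5)],   # centers pushed apart
]

scored = [objectives(samples, c) for c in candidates]
front = non_dominated(scored)
print(scored)
print(front)
```

In this toy case the second candidate is worse than the first on both objectives and drops out, while the first and third trade compactness against separation and both remain. A set like `front` is what an ensemble step would then have to reconcile into a single clustering.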

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb59/2980474/dd92c4386d44/pone.0013803.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb59/2980474/3ec84b7256b8/pone.0013803.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb59/2980474/94a7af7774f5/pone.0013803.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb59/2980474/332fa92a072d/pone.0013803.g003.jpg

Similar articles

[1]
Multi-class clustering of cancer subtypes through SVM based ensemble of pareto-optimal solutions for gene marker identification.

PLoS One. 2010-11-12

[2]
Combining Pareto-optimal clusters using supervised learning for identifying co-expressed genes.

BMC Bioinformatics. 2009-1-20

[3]
Multiobjective Simulated Annealing-Based Clustering of Tissue Samples for Cancer Diagnosis.

IEEE J Biomed Health Inform. 2015-2-20

[4]
Gene expression data analysis using multiobjective clustering improved with SVM based ensemble.

In Silico Biol. 2011

[5]
Gene-expression-based cancer subtypes prediction through feature selection and transductive SVM.

IEEE Trans Biomed Eng. 2012-10-18

[6]
Reliable classification of two-class cancer data using evolutionary algorithms.

Biosystems. 2003-11

[7]
Gene expression data clustering using a multiobjective symmetry based clustering technique.

Comput Biol Med. 2013-9-7

[8]
Selecting dissimilar genes for multi-class classification, an application in cancer subtyping.

BMC Bioinformatics. 2007-6-16

[9]
Simultaneous gene clustering and subset selection for sample classification via MDL.

Bioinformatics. 2003-6-12

[10]
A centroid-based gene selection method for microarray data classification.

J Theor Biol. 2016-7-7

Cited by

[1]
Tumor classification and biomarker discovery based on the 5'isomiR expression level.

BMC Cancer. 2019-2-7

[2]
Continuity of transcriptomes among colorectal cancer subtypes based on meta-analysis.

Genome Biol. 2018-9-25

[3]
In-silico interaction-resolution pathway activity quantification and application to identifying cancer subtypes.

BMC Med Inform Decis Mak. 2016-7-18

[4]
Identifying Cancer Biomarkers From Microarray Data Using Feature Selection and Semisupervised Learning.

IEEE J Transl Eng Health Med. 2014-12-2

[5]
Contribution of bioinformatics prediction in microRNA-based cancer therapeutics.

Adv Drug Deliv Rev. 2015-1

[6]
A novel biclustering approach to association rule mining for predicting HIV-1-human protein interactions.

PLoS One. 2012-4-23
