Feature Selection Using Approximate Conditional Entropy Based on Fuzzy Information Granule for Gene Expression Data Classification.

Author information

Zhang Hengyi

Affiliation

College of Animal Science and Technology, Northwest A&F University, Yangling, China.

Publication information

Front Genet. 2021 Mar 30;12:631505. doi: 10.3389/fgene.2021.631505. eCollection 2021.

DOI: 10.3389/fgene.2021.631505
PMID: 33859666
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8042210/
Abstract

Classification is widely used in gene expression data analysis. Because gene expression data contain many genes but few samples, feature selection is usually performed before classification. This article proposes a novel feature selection algorithm that uses approximate conditional entropy based on fuzzy information granules, and proves the correctness of the method via the monotonicity of the entropy. First, a fuzzy relation matrix is established with the Laplacian kernel. Second, an approximately equal relation on fuzzy sets is defined. Third, the approximate conditional entropy based on fuzzy information granules and the importance of internal attributes are defined. The approximate conditional entropy measures the uncertainty of knowledge from two perspectives, information theory and algebra. Finally, a greedy algorithm based on the approximate conditional entropy is designed for feature selection. Experimental results on six large-scale gene datasets show that the algorithm not only greatly reduces the dimension of the gene datasets but also outperforms five state-of-the-art algorithms in classification accuracy.
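
The pipeline outlined in the abstract (Laplacian-kernel fuzzy relation, an entropy-based uncertainty measure, greedy selection) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions. The abstract does not give the paper's formula for approximate conditional entropy, so the code substitutes a common fuzzy conditional entropy, H(D|R) = -(1/n) Σ log(|R_i ∩ D_i| / |R_i|), as a stand-in. The function names, the kernel width `sigma`, the stopping tolerance `tol`, and the use of the element-wise minimum to combine features are all hypothetical choices, not taken from the article.

```python
import numpy as np


def laplacian_relation(x, sigma=1.0):
    """Fuzzy relation for one feature: R[i, j] = exp(-|x_i - x_j| / sigma)."""
    x = np.asarray(x, dtype=float)
    return np.exp(-np.abs(x[:, None] - x[None, :]) / sigma)


def fuzzy_conditional_entropy(R, D):
    """Stand-in measure (NOT the paper's approximate conditional entropy).

    R is a fuzzy relation matrix over samples, D a crisp same-class matrix.
    Both have ones on the diagonal, so the ratio below is always positive.
    """
    inter = np.minimum(R, D).sum(axis=1)   # fuzzy cardinality of R_i ∩ D_i
    card = R.sum(axis=1)                   # fuzzy cardinality of R_i
    return float(-np.mean(np.log(inter / card)))


def greedy_select(X, y, sigma=1.0, tol=1e-6):
    """Forward greedy selection: add the feature that lowers the entropy most."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, m = X.shape
    D = (y[:, None] == y[None, :]).astype(float)
    rels = [laplacian_relation(X[:, j], sigma) for j in range(m)]

    selected = []
    R_cur = np.ones((n, n))                # empty feature set: all samples alike
    h_cur = fuzzy_conditional_entropy(R_cur, D)
    while len(selected) < m:
        best_j, best_h = None, h_cur
        for j in range(m):
            if j in selected:
                continue
            # Assumption: features are combined by the element-wise minimum.
            h = fuzzy_conditional_entropy(np.minimum(R_cur, rels[j]), D)
            if h < best_h - tol:
                best_j, best_h = j, h
        if best_j is None:                 # no candidate reduces the entropy
            break
        selected.append(best_j)
        R_cur = np.minimum(R_cur, rels[best_j])
        h_cur = best_h
    return selected
```

Calling `greedy_select(X, y)` with a samples-by-genes matrix `X` and class labels `y` returns column indices in the order they were chosen. For real gene expression data the per-feature relation matrices cost O(n²) memory each, so a practical implementation would likely compute them on demand rather than store them all.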

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0b6f/8042210/c2599b76dc96/fgene-12-631505-g001.jpg
