New feature subset selection procedures for classification of expression profiles.

Authors

Bø Trond, Jonassen Inge

Affiliation

Department of Informatics, University of Bergen, N-5020 Bergen, Norway.

Publication information

Genome Biol. 2002;3(4):RESEARCH0017. doi: 10.1186/gb-2002-3-4-research0017. Epub 2002 Mar 14.

DOI:10.1186/gb-2002-3-4-research0017

PMID:11983058

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC115205/

Abstract

BACKGROUND

Methods for extracting useful information from the datasets produced by microarray experiments are at present of much interest. Here we present new methods for finding gene sets that are well suited for distinguishing experiment classes, such as healthy versus diseased tissues. Our methods are based on evaluating genes in pairs and evaluating how well a pair in combination distinguishes two experiment classes. We tested the ability of our pair-based methods to select gene sets that generalize the differences between experiment classes and compared the performance relative to two standard methods. To assess the ability to generalize class differences, we studied how well the gene sets we select are suited for learning a classifier.
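The pair-based idea can be illustrated with a minimal sketch. The abstract does not state how a pair is scored, so the scoring rule below is an assumption made for illustration: each sample is projected onto a diagonal-linear-discriminant axis built from the two genes, and the projected values are compared between classes with a two-sample t-statistic. The names `t_score`, `pair_score`, and `rank_pairs` are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def t_score(a, b):
    """Absolute two-sample t-statistic (unequal variances) between two lists."""
    diff = abs(mean(a) - mean(b))
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return float("inf") if diff > 0 else 0.0
    return diff / se


def split_by_class(values, labels):
    """Split one gene's values into the class-0 and class-1 samples."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    return a, b


def dld_weight(values, labels):
    """Assumed per-gene weight: class mean difference over pooled variance."""
    a, b = split_by_class(values, labels)
    pooled = (variance(a) * (len(a) - 1) + variance(b) * (len(b) - 1)) / (len(values) - 2)
    return (mean(b) - mean(a)) / pooled if pooled > 0 else 0.0


def pair_score(x1, x2, labels):
    """Score a gene pair by the t-statistic of its one-dimensional projection."""
    w1, w2 = dld_weight(x1, labels), dld_weight(x2, labels)
    projected = [w1 * u + w2 * v for u, v in zip(x1, x2)]
    return t_score(*split_by_class(projected, labels))


def rank_pairs(expr, labels):
    """expr maps gene name -> expression values; returns pairs, best score first."""
    scored = [((g, h), pair_score(expr[g], expr[h], labels))
              for g, h in combinations(expr, 2)]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Scoring every pair costs time quadratic in the number of genes, so on a full array this exhaustive ranking would likely need a pre-filter or a greedy shortcut.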

RESULTS

We show that the gene sets selected by our methods outperform the standard methods, in some cases by a large margin, in terms of cross-validation prediction accuracy of the learned classifier. We show that on two public datasets, accurate diagnoses can be made using only 15-30 genes. Our results have implications for how to select marker genes and how many gene measurements are needed for diagnostic purposes.
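A rough sketch of how cross-validation prediction accuracy might be estimated for a selected gene set follows. The abstract names neither the classifier nor the cross-validation scheme, so both are assumptions here: a nearest-centroid classifier and leave-one-out splitting. The functions `nearest_centroid_predict` and `loocv_accuracy`, and the `select_genes` callback, are hypothetical names.

```python
from statistics import mean


def nearest_centroid_predict(train_X, train_y, x):
    """Return the class whose mean profile is closest (squared Euclidean) to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [row for row, y in zip(train_X, train_y) if y == label]
        centroid = [mean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, select_genes):
    """Leave-one-out accuracy for a gene-selection procedure.

    X is a list of samples (each a list of expression values), y the class
    labels, and select_genes(train_X, train_y) returns the column indices to
    keep. Selection is repeated inside every fold so that the held-out sample
    never influences which genes are chosen.
    """
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        cols = select_genes(train_X, train_y)
        reduced = [[row[c] for c in cols] for row in train_X]
        held_out = [X[i][c] for c in cols]
        if nearest_centroid_predict(reduced, train_y, held_out) == y[i]:
            correct += 1
    return correct / len(X)
```

Repeating the selection inside each fold is a design choice of this sketch: choosing genes once on all samples and then cross-validating tends to give optimistic accuracy estimates.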

CONCLUSION

When looking for differential expression between experiment classes, it may not be sufficient to look at each gene in a separate universe. Evaluating combinations of genes reveals interesting information that will not be discovered otherwise. Our results show that class prediction can be improved by taking advantage of this extra information.
