Improving the efficiency of biomarker identification using biological knowledge.

Authors

Phan John H, Yin-Goen Qiqin, Young Andrew N, Wang May D

Affiliation

Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, 313 Ferst Drive, Atlanta, GA 30332, USA.

Publication

Pac Symp Biocomput. 2009:427-38.

Abstract

Identifying and validating biomarkers from high-throughput gene expression data is important for understanding and treating cancer. Typically, we identify candidate biomarkers as features that are differentially expressed between two or more classes of samples. Many feature selection metrics rely on ranking by some measure of differential expression. However, interpreting these results is difficult due to the large variety of existing algorithms and metrics, each of which may produce different results. Consequently, a feature ranking metric may work well on some datasets but perform considerably worse on others. We propose a method to choose an optimal feature ranking metric on an individual dataset basis. A metric is optimal if, for a particular dataset, it favorably ranks features that are known to be relevant biomarkers. Extensive knowledge of biomarker candidates is available in public databases and literature. Using this knowledge, we can choose a ranking metric that produces the most biologically meaningful results. In this paper, we first describe a framework for assessing the ability of a ranking metric to detect known relevant biomarkers. We then apply this method to clinical renal cancer microarray data to choose an optimal metric and identify several candidate biomarkers.
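
The selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how a metric's agreement with prior knowledge is scored, so everything concrete here is an assumption rather than the authors' method: the two candidate metrics (an absolute Welch t-statistic and an absolute difference of group means), the scoring rule (mean normalized rank of the known biomarkers, lower is better), the function names, and the toy data with hypothetical gene labels.

```python
from statistics import mean, stdev


def t_statistic(a, b):
    """Absolute Welch t-statistic between two groups of expression values."""
    va = stdev(a) ** 2 / len(a)
    vb = stdev(b) ** 2 / len(b)
    # Fall back to a tiny denominator if both groups have zero variance.
    return abs(mean(a) - mean(b)) / ((va + vb) ** 0.5 or 1e-12)


def mean_difference(a, b):
    """Absolute difference of group means (fold-change-like on log-scale data)."""
    return abs(mean(a) - mean(b))


def rank_features(data, labels, metric):
    """Order feature names from most to least differentially expressed."""
    scores = {}
    for gene, values in data.items():
        a = [v for v, y in zip(values, labels) if y == 0]
        b = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = metric(a, b)
    return sorted(scores, key=scores.get, reverse=True)


def knowledge_score(ranking, known):
    """Assumed scoring rule: mean normalized rank of known biomarkers (0 is best)."""
    positions = [ranking.index(g) for g in known if g in ranking]
    return mean(positions) / max(len(ranking) - 1, 1)


def choose_metric(data, labels, metrics, known):
    """Pick the metric whose ranking places the known biomarkers highest."""
    scores = {
        name: knowledge_score(rank_features(data, labels, fn), known)
        for name, fn in metrics.items()
    }
    return min(scores, key=scores.get), scores


# Toy data: four hypothetical genes, four samples per class.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
data = {
    "GENE_A": [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0],  # known, large clean shift
    "GENE_B": [2.0, 2.1, 1.9, 2.0, 2.4, 2.5, 2.3, 2.4],  # known, small clean shift
    "GENE_C": [0.0, 4.0, 1.0, 5.0, 3.0, 7.0, 3.0, 7.0],  # noisy, large mean gap
    "GENE_D": [1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 1.0, 1.2],  # no real difference
}
known = {"GENE_A", "GENE_B"}  # stand-in for database or literature knowledge
metrics = {"t_statistic": t_statistic, "mean_difference": mean_difference}

best, scores = choose_metric(data, labels, metrics, known)
print(best, scores)
```

On this toy data the t-statistic is selected, because the mean-difference metric ranks the noisy GENE_C above both known genes. In practice the known set would come from curated databases or literature, and a score that accounts for chance rankings may be preferable to a plain mean rank.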
