Biomarker discovery across annotated and unannotated microarray datasets using semi-supervised learning.

Authors

Harris Cole, Ghaffari Noushin

Affiliation

Exagen Diagnostics, Inc, Houston, TX, USA.

Publication

BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S7. doi: 10.1186/1471-2164-9-S2-S7.

Abstract

The growing body of DNA microarray data has the potential to advance our understanding of the molecular basis of disease. However, annotating microarray datasets with clinically useful information is not always possible, as this often requires access to detailed patient records. In this study we introduce GLAD, a new Semi-Supervised Learning (SSL) method for combining independent annotated datasets and unannotated datasets with the aim of identifying more robust sample classifiers. In our method, independent models are developed using subsets of genes for the annotated and unannotated datasets. These models are evaluated according to a scoring function that incorporates terms for classification accuracy on annotated data and relative cluster separation in unannotated data. Improved models are iteratively generated using a genetic algorithm feature selection technique. Our results show that adding unannotated data to training significantly improves classifier robustness.
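
The abstract describes the scoring function and the genetic search only in outline, so the following is a minimal sketch of the general idea, not the authors' GLAD implementation. Everything concrete in it is an assumption made for illustration: the leave-one-out nearest-centroid classifier, the 2-means separation ratio, the weight `alpha`, the crossover and mutation operators, and the synthetic data.

```python
import numpy as np

def accuracy_term(X, y, genes):
    """Leave-one-out nearest-centroid accuracy on annotated samples (assumed classifier)."""
    Xs = X[:, genes]
    hits = 0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        classes = np.unique(y[mask])
        centroids = np.array([Xs[mask & (y == c)].mean(axis=0) for c in classes])
        pred = classes[np.argmin(np.linalg.norm(centroids - Xs[i], axis=1))]
        hits += pred == y[i]
    return hits / len(y)

def separation_term(X, genes, iters=10):
    """Separation score in [0, 1) from a 2-means split of unannotated samples (assumed measure)."""
    Xs = X[:, genes]
    # Deterministic start: the sample farthest from the mean, then the sample farthest from it.
    a = Xs[np.argmax(np.linalg.norm(Xs - Xs.mean(axis=0), axis=1))]
    b = Xs[np.argmax(np.linalg.norm(Xs - a, axis=1))]
    centers = np.array([a, b], dtype=float)
    for _ in range(iters):
        d = np.linalg.norm(Xs[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = Xs[labels == k].mean(axis=0)
    d = np.linalg.norm(Xs[:, None, :] - centers[None, :, :], axis=2)
    within = d.min(axis=1).mean()                       # mean distance to the nearest center
    between = np.linalg.norm(centers[0] - centers[1])   # distance between the two centers
    return between / (between + within + 1e-12)

def score(genes, X_ann, y_ann, X_unann, alpha=0.5):
    """Weighted sum of the two terms; alpha is a hypothetical weight."""
    return (alpha * accuracy_term(X_ann, y_ann, genes)
            + (1 - alpha) * separation_term(X_unann, genes))

def ga_select(X_ann, y_ann, X_unann, k=3, pop_size=20, generations=15, seed=0):
    """Toy genetic algorithm over fixed-size gene subsets (assumed operators)."""
    rng = np.random.default_rng(seed)
    n_genes = X_ann.shape[1]
    pop = [rng.choice(n_genes, size=k, replace=False) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: score(g, X_ann, y_ann, X_unann), reverse=True)
        parents = pop[: pop_size // 2]                  # keep the better half
        children = []
        for _ in range(pop_size - len(parents)):
            p, q = rng.choice(len(parents), size=2, replace=False)
            pool = np.union1d(parents[p], parents[q])   # crossover: draw from both parents
            child = rng.choice(pool, size=k, replace=False)
            if rng.random() < 0.3:                      # mutation: swap in an unused gene
                unused = np.setdiff1d(np.arange(n_genes), child)
                child[rng.integers(k)] = rng.choice(unused)
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda g: score(g, X_ann, y_ann, X_unann))

# Synthetic toy data (not from the paper): genes 0-2 carry the class signal in both sets.
rng = np.random.default_rng(1)
n_genes = 30
y_ann = np.array([0] * 10 + [1] * 10)
X_ann = rng.normal(size=(20, n_genes))
X_ann[y_ann == 1, :3] += 3.0
hidden = np.array([0] * 15 + [1] * 15)                  # labels the method never sees
X_unann = rng.normal(size=(30, n_genes))
X_unann[hidden == 1, :3] += 3.0

best = ga_select(X_ann, y_ann, X_unann)
print(sorted(best.tolist()), round(score(best, X_ann, y_ann, X_unann), 3))
```

In this sketch a gene subset scores well only if it both classifies the annotated samples and splits the unannotated samples into well-separated groups, which is the intuition behind combining the two terms. How GLAD actually defines and weights these terms is not stated in the abstract.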

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca3c/2559897/a78cbf2d7ae6/1471-2164-9-S2-S7-1.jpg
