A compendium to ensure computational reproducibility in high-dimensional classification tasks.

Author information

Ruschhaupt Markus, Huber Wolfgang, Poustka Annemarie, Mansmann Ulrich

Affiliation

Division of Molecular Genome Analysis, German Cancer Research Centre.

Publication information

Stat Appl Genet Mol Biol. 2004;3:Article37. doi: 10.2202/1544-6115.1078. Epub 2004 Dec 19.

Abstract

We present the concept and an implementation of a compendium for classifying high-dimensional data from microarray gene expression profiles. A compendium is an interactive document that bundles primary data, statistical processing methods, figures, and derived data together with the textual documentation and conclusions. Interactivity allows the reader to modify and extend these components. We address the following questions: how much does the discriminatory power of a classifier depend on the choice of the algorithm used to identify it; which alternative classifiers could be used just as well; and how robust is the result? The answers to these questions are essential prerequisites for the validation and biological interpretation of classifiers. We illustrate the approach by examining these questions for a specific breast cancer microarray data set that was first studied by Huang et al. (2003).
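The first question in the abstract, how much the estimated discriminatory power depends on the choice of algorithm, can be illustrated with a small, self-contained sketch. It is not the authors' compendium or their code. The data are synthetic (the Huang et al. data set is not reproduced here), and the two classifiers, the feature-ranking rule, and every function name (`make_data`, `select_features`, `cv_error`, and so on) are assumptions chosen for illustration. The sketch compares two simple classifiers by cross-validation and repeats the feature selection inside each fold, so that held-out samples do not influence which features are chosen.

```python
import random
import statistics


def make_data(n_samples=40, n_features=200, n_informative=5, shift=1.5, seed=0):
    """Synthetic two-class data: mostly noise features, a few shifted ones."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate labels so both classes are balanced
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += shift * label  # only the first few features carry signal
        X.append(row)
        y.append(label)
    return X, y


def select_features(X, y, k):
    """Rank features by absolute difference of class means; keep the top k."""
    scores = []
    for j in range(len(X[0])):
        m0 = statistics.fmean(x[j] for x, t in zip(X, y) if t == 0)
        m1 = statistics.fmean(x[j] for x, t in zip(X, y) if t == 1)
        scores.append((abs(m1 - m0), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def nearest_centroid(X_train, y_train, x):
    """Predict the class whose mean training profile is closest to x."""
    centroids = {}
    for label in (0, 1):
        rows = [r for r, t in zip(X_train, y_train) if t == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return min(centroids, key=lambda c: sq_dist(centroids[c], x))


def one_nearest_neighbour(X_train, y_train, x):
    """Predict the label of the single closest training sample."""
    best = min(range(len(X_train)), key=lambda i: sq_dist(X_train[i], x))
    return y_train[best]


def cv_error(X, y, classify, n_folds=5, k=10, seed=0):
    """Cross-validated error rate with feature selection repeated per fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    errors = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        feats = select_features(X_tr, y_tr, k)  # training samples only
        X_tr_sel = [[r[j] for j in feats] for r in X_tr]
        for i in test_idx:
            x = [X[i][j] for j in feats]
            errors += classify(X_tr_sel, y_tr, x) != y[i]
    return errors / len(X)


X, y = make_data()
results = {
    "nearest centroid": cv_error(X, y, nearest_centroid),
    "1-nearest neighbour": cv_error(X, y, one_nearest_neighbour),
}
for name, err in results.items():
    print(f"{name}: cross-validated error rate = {err:.2f}")
```

Rerunning `cv_error` with different `seed` values, or a different number of selected features `k`, gives a rough impression of how stable each estimate is, which corresponds to the robustness question raised in the abstract.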
