Finding biomarker signatures in pooled sample designs: a simulation framework for methodological comparisons.

Authors

Telaar Anna, Nürnberg Gerd, Repsilber Dirk

Affiliation

Genetics and Biometry, Leibniz Institute for Farm Animal Biology, Wilhelm-Stahl-Allee 2, D-18196 Dummerstorf, Germany.

Publication information

Adv Bioinformatics. 2010;2010:318573. doi: 10.1155/2010/318573. Epub 2010 Jul 4.

Abstract

Discriminating patterns in gene expression data can be detected with various statistical learning methods. It has been proposed that sample pooling has negative effects in this context; however, pooling cannot always be avoided. We propose a simulation framework that explicitly varies the parameters of the patterns, the experimental design, the noise, and the choice of method in order to determine which effects on classification performance are to be expected. We use a two-group classification task and simulated gene expression data with independent differentially expressed genes, bivariate linear patterns, and the combination of both. Our results show a clear increase of prediction error with pool size. For pooled training sets, powered partial least squares discriminant analysis outperforms discriminant analysis, random forests, and support vector machines with linear or radial kernels in two of the three simulated scenarios. The proposed simulation approach can be used to systematically investigate a number of additional scenarios of practical interest.
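The data-generation and pooling steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`simulate_group_data`, `pool_samples`), the Gaussian noise model, the additive group shift, and the modelling of a pool as the arithmetic mean of its members' expression profiles are all assumptions made here for illustration. Only the scenario with independent differentially expressed genes is covered; the bivariate linear patterns and the classifiers are left out.

```python
import numpy as np


def simulate_group_data(n_per_group, n_genes, n_de, effect, noise_sd, rng):
    """Simulate a two-group expression matrix (hypothetical helper).

    Assumption: expression values are independent Gaussian noise, and the
    first n_de genes are shifted by `effect` in group 1.
    """
    X = rng.normal(0.0, noise_sd, size=(2 * n_per_group, n_genes))
    y = np.repeat([0, 1], n_per_group)
    X[y == 1, :n_de] += effect
    return X, y


def pool_samples(X, y, pool_size):
    """Form pools within each group (hypothetical helper).

    Assumption: a pool is the arithmetic mean of `pool_size` individual
    profiles from the same group; leftover samples are dropped.
    """
    pooled_X, pooled_y = [], []
    for label in np.unique(y):
        X_group = X[y == label]
        n_pools = len(X_group) // pool_size
        for i in range(n_pools):
            members = X_group[i * pool_size:(i + 1) * pool_size]
            pooled_X.append(members.mean(axis=0))
            pooled_y.append(label)
    return np.array(pooled_X), np.array(pooled_y)


# Example: 12 individuals per group, 50 genes, 5 of them differentially expressed.
rng = np.random.default_rng(0)
X, y = simulate_group_data(n_per_group=12, n_genes=50, n_de=5,
                           effect=1.0, noise_sd=1.0, rng=rng)
X_pooled, y_pooled = pool_samples(X, y, pool_size=3)
```

With a fixed number of individuals, a larger `pool_size` leaves fewer training profiles per group, which is the kind of design parameter such a framework would vary. Any classifier can then be trained on `X_pooled` and evaluated on unpooled test samples.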

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b89/2909718/5624e97e2098/ABI2010-318573.001.jpg
