Determining Cutoff Point of Ensemble Trees Based on Sample Size in Predicting Clinical Dose with DNA Microarray Data.

Author information

Yılmaz Isıkhan Selen, Karabulut Erdem, Alpar Celal Reha

Affiliations

Vocational School of Social Sciences, Hacettepe University, Ankara, Turkey; Department of Biostatistics, Faculty of Medicine, Hacettepe University, Ankara, Turkey.

Department of Biostatistics, Faculty of Medicine, Hacettepe University, Ankara, Turkey.

Publication information

Comput Math Methods Med. 2016;2016:6794916. doi: 10.1155/2016/6794916. Epub 2016 Dec 20.

Abstract

The evaluation of dose prediction based on genetic or clinical data has advanced substantially in recent years. The aim of this study is to predict various clinical dose values from DNA gene expression datasets using data mining techniques. Eleven real gene expression datasets containing dose values were included. First, important genes for dose prediction were selected using iterative sure independence screening. Then, the performances of regression trees (RTs), support vector regression (SVR), RT bagging, SVR bagging, and RT boosting were examined. The results demonstrated that a regression-based feature selection method substantially reduced the number of irrelevant genes from raw datasets. Overall, the best prediction performance in nine of 11 datasets was achieved using SVR; the second most accurate performance was provided by a gradient-boosting machine (GBM). Analysis of various dose values based on microarray gene expression data identified genes common to our study and the referenced studies. According to our findings, SVR and GBM can be good predictors for dose-gene datasets. The study also identified a sample size of n = 25 as a cutoff point for RT bagging to outperform a single RT.
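The screening step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it shows plain (non-iterative) sure independence screening, which ranks genes by the absolute marginal correlation between each gene and the dose and keeps the top d, whereas the study used the iterative variant. The function names `pearson` and `sis_select`, the synthetic data, and the default d = floor(n / log n) (a commonly cited choice for this method) are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def sis_select(X, y, d=None):
    """Return indices of the d features with the largest |marginal correlation| with y.

    X is a list of samples; each sample is a list of gene expression values.
    This is a simplified, non-iterative screening step (hypothetical helper).
    """
    n = len(X)
    p = len(X[0])
    if d is None:
        # Assumed default: keep floor(n / log n) features, at least one.
        d = max(1, int(n / math.log(n)))
    d = min(d, p)
    scores = []
    for j in range(p):
        column = [row[j] for row in X]
        scores.append((abs(pearson(column, y)), j))
    # Sort by descending score; break ties by feature index for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:d]]


# Synthetic toy data (not from the study): 40 samples, 200 "genes",
# with the dose driven by genes 5 and 17 plus a little noise.
rng = random.Random(0)
n_samples, n_genes = 40, 200
X = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n_samples)]
dose = [3.0 * row[5] - 3.0 * row[17] + rng.gauss(0, 0.1) for row in X]

selected = sis_select(X, dose)
print("kept", len(selected), "of", n_genes, "genes:", selected)
```

In a workflow like the one described, the retained genes would then be passed to the regression models (RT, SVR, bagging, boosting). The iterative variant repeats the screening on residuals to recover genes that are only jointly informative, which this sketch does not attempt.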

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7127/5206477/bc1c366100ee/CMMM2016-6794916.001.jpg
