Trakya University, Faculty of Medicine, Department of Biostatistics and Medical Informatics, 22030, Merkez, Edirne, Turkey; Turcosa Analytics Solutions Ltd Co, Erciyes Teknopark, 38039, Kayseri, Turkey.
Hacettepe University, Faculty of Medicine, Department of Biostatistics, 06100, Sihhiye, Ankara, Turkey; Turcosa Analytics Solutions Ltd Co, Erciyes Teknopark, 38039, Kayseri, Turkey.
Comput Biol Med. 2017 Oct 1;89:487-496. doi: 10.1016/j.compbiomed.2017.08.031. Epub 2017 Sep 5.
Survival analysis methods are widely used in cancer studies. Combining clinical data with genomic data has been shown to improve the predictive performance of these methods, but it also leads to a high-dimensional data problem. Methods developed over the last decade address this problem; however, there is still a strong need for an easily accessible, user-friendly, and interactive tool for performing survival analysis in the presence of genomic data. We developed geneSurv, an open-source and freely available web-based tool for survival analysis that can deal with high-dimensional data. The tool includes classical methods, such as Kaplan-Meier estimation and Cox proportional hazards regression, and advanced methods, such as penalized Cox regression and Random Survival Forests. It has a simple, interactive interface and handles high-dimensional data through feature selection and ensemble methods. To dichotomize gene expression values, geneSurv identifies optimal cutoff points by maximizing several test statistics. Users can upload microarray, RNA-Seq, ChIP-Seq, proteomics, metabolomics, or clinical data as an n × p data matrix, where n is the number of samples and p is the number of genes. The tool is freely available at www.biosoft.hacettepe.edu.tr/geneSurv. All source code is available at https://github.com/selcukorkmaz/geneSurv under the GPL-3 license.
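The idea of choosing a cutoff by maximizing a test statistic can be illustrated with a minimal sketch. It is not geneSurv's implementation: it assumes a single statistic (the two-group log-rank chi-square), candidate cutoffs at the midpoints between consecutive distinct expression values, and a minimum group size. The function names `logrank_chi2` and `optimal_cutoff` and the `min_group` parameter are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def logrank_chi2(times: Sequence[float], events: Sequence[int],
                 in_high: Sequence[bool]) -> float:
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp = 0.0
    variance = 0.0
    # Visit each distinct time at which at least one event occurred.
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_high = sum(1 for i in at_risk if in_high[i])
        dead = [i for i in at_risk if times[i] == t and events[i]]
        d = len(dead)
        d_high = sum(1 for i in dead if in_high[i])
        # Observed minus expected events in the "high" group.
        obs_minus_exp += d_high - d * n_high / n
        if n > 1:
            # Hypergeometric variance of d_high at this event time.
            p = n_high / n
            variance += d * p * (1 - p) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def optimal_cutoff(expression: Sequence[float], times: Sequence[float],
                   events: Sequence[int],
                   min_group: int = 2) -> Tuple[Optional[float], float]:
    """Return (cutoff, statistic) with the largest log-rank statistic.

    The cutoff is None if no candidate leaves at least `min_group`
    samples on each side (min_group is an assumption of this sketch).
    """
    best_cut: Optional[float] = None
    best_stat = 0.0
    values: List[float] = sorted(set(expression))
    # Candidate cutoffs: midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2
        in_high = [x > cut for x in expression]
        k = sum(in_high)
        if k < min_group or len(expression) - k < min_group:
            continue
        stat = logrank_chi2(times, events, in_high)
        if best_cut is None or stat > best_stat:
            best_cut, best_stat = cut, stat
    return best_cut, best_stat


if __name__ == "__main__":
    # Toy data (invented for illustration): high expression, short survival.
    expression = [1, 2, 3, 4, 5, 6, 7, 8]
    times = [20, 18, 22, 19, 5, 4, 6, 3]
    events = [1] * 8  # 1 = event observed, 0 = censored
    print(optimal_cutoff(expression, times, events))
```

Because the cutoff is selected to maximize the statistic, a p-value computed from that maximum as if the cutoff had been fixed in advance would be too optimistic; it needs an adjustment for the multiple cutoffs examined.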