Vey Johannes, Kapsner Lorenz A., Fuchs Maximilian, Unberath Philipp, Veronesi Giulia, Kunz Meik
Functional Genomics and Systems Biology Group, Department of Bioinformatics, University of Würzburg, 97074 Würzburg, Germany.
Institute of Medical Biometry and Informatics, University of Heidelberg, Im Neuenheimer Feld 130.3, 69120 Heidelberg, Germany.
Cancers (Basel). 2019 Oct 21;11(10):1606. doi: 10.3390/cancers11101606.
The identification of biomarker signatures is important for cancer diagnosis and prognosis. However, the detection of clinically reliable signatures is hampered by limited data availability, which may restrict statistical power. Moreover, methods for integrating large sample cohorts and identifying signatures are limited. We present a step-by-step computational protocol for functional gene expression analysis and the identification of diagnostic and prognostic signatures by combining meta-analysis with machine learning and survival analysis. The novelty of the toolbox lies in its all-in-one functionality, generic design, and modularity. It is exemplified for lung cancer, including a comprehensive evaluation using different validation strategies. However, the protocol is not restricted to specific disease types and can therefore be used by a broad community. The accompanying R package vignette runs in ~1 h and describes the workflow in detail for use by researchers with limited bioinformatics training.