Hédou Julien, Marić Ivana, Bellan Grégoire, Einhaus Jakob, Gaudillière Dyani K, Ladant Francois-Xavier, Verdonk Franck, Stelzer Ina A, Feyaerts Dorien, Tsai Amy S, Ganio Edward A, Sabayev Maximilian, Gillard Joshua, Bonham Thomas A, Sato Masaki, Diop Maïgane, Angst Martin S, Stevenson David, Aghaeepour Nima, Montanari Andrea, Gaudillière Brice
Department of Anesthesiology, Perioperative & Pain Medicine, Stanford University, Stanford, CA.
Department of Pediatrics, Stanford University, Stanford, CA.
Res Sq. 2023 Feb 28:rs.3.rs-2609859. doi: 10.21203/rs.3.rs-2609859/v1.
High-content omic technologies coupled with sparsity-promoting regularization methods (SRMs) have transformed the biomarker discovery process. However, translating computational results into clinical use remains challenging. A rate-limiting step is the rigorous selection of reliable biomarker candidates from among the many biological features included in multivariate models. We propose Stabl, a machine learning framework that unifies the biomarker discovery process with multivariate predictive modeling of clinical outcomes by selecting a sparse and reliable set of biomarkers. Evaluation of Stabl on synthetic datasets and four independent clinical studies demonstrates improved biomarker sparsity and reliability compared to commonly used SRMs at similar predictive performance. Stabl readily extends to double- and triple-omics integration tasks and identifies a sparser and more reliable set of biomarkers than those selected by state-of-the-art early- and late-fusion SRMs, thereby facilitating the biological interpretation and clinical translation of complex multi-omic predictive models. The complete package for Stabl is available online at https://github.com/gregbellan/Stabl.
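To make the idea of sparse and reliable feature selection concrete, the following is a minimal sketch of one generic approach: fit a Lasso (a common SRM) on repeated half-subsamples and keep the features that receive a nonzero coefficient in most fits. It is an illustration only and is not Stabl's algorithm or API. The synthetic data, the function names, the penalty `lam`, and the 0.8 frequency threshold are all assumptions chosen for the example.

```python
import random


def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=50):
    """Minimize (1/2n)||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def selection_frequencies(X, y, lam, n_subsamples, rng):
    """Fraction of half-subsample Lasso fits in which each feature is nonzero."""
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)
        beta = lasso_coordinate_descent([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n_subsamples for c in counts]


# Hypothetical synthetic data: 20 features, only features 0 and 1 drive the outcome.
rng = random.Random(0)
n_samples, n_features = 60, 20
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0.0, 0.5) for row in X]

freq = selection_frequencies(X, y, lam=0.5, n_subsamples=30, rng=rng)
threshold = 0.8  # assumed cutoff, fixed by hand for this illustration
selected = [j for j, f in enumerate(freq) if f >= threshold]
print("selected features:", selected)
```

Here the frequency cutoff is fixed by hand; how such a threshold should be chosen in practice is exactly the kind of question a framework for reliable biomarker selection has to address.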