Culos Anthony, Tsai Amy S, Stanley Natalie, Becker Martin, Ghaemi Mohammad S, McIlwain David R, Fallahzadeh Ramin, Tanada Athena, Nassar Huda, Espinosa Camilo, Xenochristou Maria, Ganio Edward, Peterson Laura, Han Xiaoyuan, Stelzer Ina A, Ando Kazuo, Gaudilliere Dyani, Phongpreecha Thanaphong, Marić Ivana, Chang Alan L, Shaw Gary M, Stevenson David K, Bendall Sean, Davis Kara L, Fantl Wendy, Nolan Garry P, Hastie Trevor, Tibshirani Robert, Angst Martin S, Gaudilliere Brice, Aghaeepour Nima
Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, USA.
Department of Biomedical Data Sciences, Stanford University, Stanford, CA, USA.
Nat Mach Intell. 2020 Oct;2(10):619-628. doi: 10.1038/s42256-020-00232-8. Epub 2020 Oct 12.
The dense network of interconnected cellular signalling responses that are quantifiable in peripheral immune cells provides a wealth of actionable immunological insights. Although high-throughput single-cell profiling techniques, including polychromatic flow and mass cytometry, have matured to a point that enables detailed immune profiling of patients in numerous clinical settings, the limited cohort size and high dimensionality of the data increase the possibility of false-positive discoveries and model overfitting. We introduce a generalizable machine learning platform, the immunological Elastic-Net (iEN), which incorporates immunological knowledge directly into the predictive models. Importantly, the algorithm maintains the exploratory nature of the high-dimensional dataset, allowing for the inclusion of immune features with strong predictive capabilities even if they are not consistent with prior knowledge. In three independent studies, our method improves predictions of clinically relevant outcomes from mass cytometry data generated from whole blood; it also improves predictions on a large simulated dataset. The iEN is available under an open-source licence.
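The idea of folding prior knowledge into an Elastic-Net while still letting strongly predictive features enter can be illustrated with a minimal sketch. The abstract does not give the iEN formulation, so the scheme below is an assumption: each feature column is multiplied by a prior weight in (0, 1] before an ordinary elastic net is fitted, which penalises features with low prior support more heavily without excluding them. The names `prior_weighted_elastic_net`, `elastic_net_cd`, `prior`, `lam` and `alpha` are hypothetical and are not taken from the iEN software.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_cd(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for
    (1/2n)*||y - X b||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2).
    X is a list of rows, y a list of responses; no intercept is fitted."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot enter the model
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


def prior_weighted_elastic_net(X, y, prior, lam, alpha):
    """Assumed scheme (not necessarily iEN): rescale column j by prior[j] in (0, 1],
    fit a plain elastic net, then map coefficients back to the original scale."""
    X_scaled = [[x * w for x, w in zip(row, prior)] for row in X]
    beta_scaled = elastic_net_cd(X_scaled, y, lam, alpha)
    return [b * w for b, w in zip(beta_scaled, prior)]


if __name__ == "__main__":
    # Two identical features; only the prior distinguishes them.
    X = [[1.0, 1.0], [-1.0, -1.0], [2.0, 2.0], [-2.0, -2.0]]
    y = [1.0, -1.0, 2.0, -2.0]
    print(prior_weighted_elastic_net(X, y, prior=[1.0, 0.1], lam=0.1, alpha=0.5))
```

In this toy case the two features are indistinguishable from the data alone, so the prior decides which one carries the coefficient. A feature with a low prior weight that is the only predictor of the outcome still receives a non-zero coefficient, just a more strongly shrunk one, which mirrors the exploratory behaviour the abstract describes.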