Biases in machine-learning models of human single-cell data.

Authors

Willem Theresa, Shitov Vladimir A, Luecken Malte D, Kilbertus Niki, Bauer Stefan, Piraud Marie, Buyx Alena, Theis Fabian J

Affiliations

TUM School for Medicine and Health, Institute of History and Ethics in Medicine, Technical University of Munich, Munich, Germany.

Helmholtz Munich, Munich, Germany.

Publication information

Nat Cell Biol. 2025 Mar;27(3):384-392. doi: 10.1038/s41556-025-01619-8. Epub 2025 Feb 19.

Abstract

Recent machine-learning (ML)-based advances in single-cell data science have enabled the stratification of human tissue donors at single-cell resolution, promising to provide valuable diagnostic and prognostic insights. However, such insights are susceptible to biases. Here we discuss various biases that emerge along the pipeline of ML-based single-cell analysis, ranging from societal biases affecting whose samples are collected, to clinical and cohort biases that influence the generalizability of single-cell datasets, biases stemming from single-cell sequencing, ML biases specific to (weakly supervised or unsupervised) ML models trained on human single-cell samples and biases during the interpretation of results from ML models. We end by providing methods for single-cell data scientists to assess and mitigate biases, and call for efforts to address the root causes of biases.
