Biases in machine-learning models of human single-cell data.

Authors

Willem Theresa, Shitov Vladimir A, Luecken Malte D, Kilbertus Niki, Bauer Stefan, Piraud Marie, Buyx Alena, Theis Fabian J

Affiliations

TUM School for Medicine and Health, Institute of History and Ethics in Medicine, Technical University of Munich, Munich, Germany.

Helmholtz Munich, Munich, Germany.

Publication information

Nat Cell Biol. 2025 Mar;27(3):384-392. doi: 10.1038/s41556-025-01619-8. Epub 2025 Feb 19.

DOI: 10.1038/s41556-025-01619-8
PMID: 39972066

Abstract

Recent machine-learning (ML)-based advances in single-cell data science have enabled the stratification of human tissue donors at single-cell resolution, promising to provide valuable diagnostic and prognostic insights. However, such insights are susceptible to biases. Here we discuss various biases that emerge along the pipeline of ML-based single-cell analysis, ranging from societal biases affecting whose samples are collected, to clinical and cohort biases that influence the generalizability of single-cell datasets, biases stemming from single-cell sequencing, ML biases specific to (weakly supervised or unsupervised) ML models trained on human single-cell samples and biases during the interpretation of results from ML models. We end by providing methods for single-cell data scientists to assess and mitigate biases, and call for efforts to address the root causes of biases.
