
scAI-SNP: a method for inferring ancestry from single-cell data.

Author information

Hong Sung Chul, Muyas Francesc, Cortés-Ciriano Isidro, Hormoz Sahand

Affiliations

Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215 USA.

European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD UK.

Publication information

BMC Methods. 2025;2(1):10. doi: 10.1186/s44330-025-00029-4. Epub 2025 May 19.

DOI:10.1186/s44330-025-00029-4
PMID:40401145
Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12089154/

Abstract

BACKGROUND

Collaborative efforts, such as the Human Cell Atlas, are rapidly accumulating large amounts of single-cell data. To ensure that single-cell atlases are representative of human genetic diversity, we need to determine the ancestry of the donors from whom single-cell data are generated. Self-reporting of race and ethnicity, although important, can be biased and is not always available for the datasets already collected.

METHODS

Here, we introduce scAI-SNP, a tool to infer ancestry directly from single-cell genomics data. To train scAI-SNP, we identified 4.5 million ancestry-informative single-nucleotide polymorphisms (SNPs) in the 1000 Genomes Project dataset across 3201 individuals from 26 population groups. For a query single-cell dataset, scAI-SNP uses these ancestry-informative SNPs to compute the contribution of each of the 26 population groups to the ancestry of the donor from whom the cells were obtained.
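The abstract does not state how the per-population contributions are computed. As an illustration only, the sketch below assumes one plausible formulation: the donor's observed genotype is modeled as a convex mixture of per-population allele frequencies, and the mixture weights are fit by projected gradient descent over the SNPs that happen to be covered. The function names, the toy frequencies, and the fitting procedure are all hypothetical and are not taken from scAI-SNP.

```python
from typing import List, Optional, Sequence


def project_to_simplex(v: Sequence[float]) -> List[float]:
    """Euclidean projection of v onto {w : w_k >= 0, sum(w) = 1}."""
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u_i
        t = (cumulative - 1.0) / i
        if u_i - t > 0:
            theta = t  # keep the threshold from the last index that passes
    return [max(x - theta, 0.0) for x in v]


def estimate_contributions(
    pop_freqs: Sequence[Sequence[float]],
    query: Sequence[Optional[float]],
    steps: int = 2000,
) -> List[float]:
    """Fit mixture weights over populations (hypothetical helper).

    pop_freqs[k][j]: alt-allele frequency of SNP j in population k.
    query[j]: donor alt-allele fraction in [0, 1], or None when SNP j
              has no coverage (the sparse single-cell case).
    """
    observed = [j for j, g in enumerate(query) if g is not None]
    if not observed:
        raise ValueError("no ancestry-informative SNP is covered")
    n_pops = len(pop_freqs)
    weights = [1.0 / n_pops] * n_pops
    # Step size 1/L, where L bounds the gradient's Lipschitz constant.
    lipschitz = sum(pop_freqs[k][j] ** 2 for k in range(n_pops) for j in observed)
    step = 1.0 / max(lipschitz, 1e-12)
    for _ in range(steps):
        residual = [
            sum(weights[k] * pop_freqs[k][j] for k in range(n_pops)) - query[j]
            for j in observed
        ]
        grad = [
            sum(r * pop_freqs[k][j] for r, j in zip(residual, observed))
            for k in range(n_pops)
        ]
        weights = project_to_simplex(
            [weights[k] - step * grad[k] for k in range(n_pops)]
        )
    return weights


if __name__ == "__main__":
    # Toy data: 3 populations, 6 SNPs; SNP 4 is unobserved in the query.
    freqs = [
        [0.9, 0.1, 0.8, 0.2, 0.5, 0.7],
        [0.1, 0.9, 0.2, 0.8, 0.5, 0.3],
        [0.5, 0.5, 0.1, 0.1, 0.9, 0.9],
    ]
    donor = [0.66, 0.34, 0.62, 0.38, None, 0.58]
    print(estimate_contributions(freqs, donor))
```

Restricting the fit to covered SNPs is what lets a formulation like this tolerate missing data: uncovered sites simply drop out of the objective instead of being imputed.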

RESULTS

Using diverse single-cell datasets with matched whole-genome sequencing data, we show that scAI-SNP is robust to the sparsity of single-cell data, can accurately and consistently infer ancestry from samples derived from diverse types of tissues and cancer cells, and can be applied to different modalities of single-cell profiling assays, such as single-cell RNA-seq and single-cell ATAC-seq.

DISCUSSION

Finally, we argue that ensuring that single-cell atlases represent diverse ancestry, ideally alongside race and ethnicity, is ultimately important for improved and equitable health outcomes by accounting for human diversity.

SUPPLEMENTARY INFORMATION

The online version contains supplementary material available at 10.1186/s44330-025-00029-4.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5d9/12089154/b99917117996/44330_2025_29_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5d9/12089154/b4f8a171ae17/44330_2025_29_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5d9/12089154/d74f9005c94e/44330_2025_29_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5d9/12089154/1b7f93cf332b/44330_2025_29_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5d9/12089154/4c995a8cabc4/44330_2025_29_Fig5_HTML.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5d9/12089154/8fb9b1b41de6/44330_2025_29_Fig6_HTML.jpg
