Single-cell data combined with phenotypes improves variant interpretation.

Authors

Chapman Timothy, Lassmann Timo

Affiliations

The Kids Research Institute Australia, 15 Hospital Ave, Nedlands, WA, 6009, Australia.

UWA Centre for Child Health Research, The University of Western Australia, 35 Stirling Hwy, Crawley, Western Australia, 6009, Australia.

Publication information

BMC Genomics. 2025 May 28;26(1):540. doi: 10.1186/s12864-025-11711-w.

DOI: 10.1186/s12864-025-11711-w
PMID: 40437370
Abstract

BACKGROUND

Whole genome sequencing offers significant potential to improve the diagnosis and treatment of rare diseases by enabling the identification of thousands of rare, potentially pathogenic variants. Existing variant prioritisation tools can be complemented by approaches that incorporate phenotype specificity and provide contextual biological information, such as tissue or cell-type specificity. We hypothesised that integrating single-cell gene expression data into phenotype-specific models would improve the accuracy and interpretability of pathogenic variant prioritisation.

METHODS

To test this hypothesis, we developed IMPPROVE, a new tool that constructs phenotype-specific ensemble models integrating CADD scores with bulk and single-cell gene expression data. We constructed a total of 1,866 Random Forest models for individual Human Phenotype Ontology (HPO) terms, incorporating both bulk and single-cell expression data.
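As a rough illustration of the kind of input a phenotype-specific model could take, the sketch below assembles one feature row per variant by placing the variant's CADD score next to the expression profile of the affected gene. The abstract does not describe IMPPROVE's actual feature encoding, so every name here (`build_feature_rows`, `build_per_phenotype_datasets`, the `gene`/`cadd`/`pathogenic` keys, the gene and cell-type labels) is hypothetical, and filling missing expression values with 0.0 is an assumption made only to keep the example short.

```python
def build_feature_rows(variants, expression, columns):
    """Build (X, y) for one phenotype.

    variants:   list of dicts with hypothetical keys 'gene', 'cadd', 'pathogenic'
    expression: dict mapping gene -> {tissue_or_cell_type: expression level}
    columns:    ordered list of tissue / cell-type names to use as features
    """
    X, y = [], []
    for v in variants:
        profile = expression.get(v["gene"], {})
        # Feature row: CADD score first, then expression per tissue/cell type.
        # Missing values default to 0.0 (an assumption, not the paper's method).
        row = [float(v["cadd"])] + [float(profile.get(c, 0.0)) for c in columns]
        X.append(row)
        y.append(1 if v["pathogenic"] else 0)
    return X, y


def build_per_phenotype_datasets(variants_by_hpo, expression, columns):
    """Return {hpo_term: (X, y)} so each HPO term gets its own training set."""
    return {
        term: build_feature_rows(variants, expression, columns)
        for term, variants in variants_by_hpo.items()
    }


# Toy, made-up inputs for illustration only.
expression = {
    "GENE_A": {"neuron": 8.5, "hepatocyte": 0.2},
    "GENE_B": {"neuron": 0.1},
}
variants_by_hpo = {
    "HP:0001250": [
        {"gene": "GENE_A", "cadd": 27.3, "pathogenic": True},
        {"gene": "GENE_B", "cadd": 4.1, "pathogenic": False},
    ],
}
datasets = build_per_phenotype_datasets(
    variants_by_hpo, expression, ["neuron", "hepatocyte"]
)
```

Each `(X, y)` pair could then be passed to any Random Forest implementation, for example scikit-learn's `RandomForestClassifier`, to train a separate model for that phenotype.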

RESULTS

Our phenotype-specific models utilising expression data can better predict pathogenic variants in 90% of the phenotypes (HPO terms) considered. Using single-cell expression data instead of bulk benefited the models, significantly shifting the proportion of pathogenic variants that were correctly identified at a fixed false positive rate (using an approximate Wilcoxon signed rank test). Models for 57 phenotypes exhibited a large performance difference depending on the dataset used. Further analysis revealed biological links between the pathology and the tissues or cell types used by these 57 models.
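The evaluation described above rests on two quantities: the proportion of pathogenic variants recovered at a fixed false positive rate, and a paired comparison of that proportion across phenotypes. The sketch below shows one way both could be computed. It is not the authors' code: the function names, the default false positive rate of 0.05, and the choice of a plain normal approximation (no tie or continuity correction) are assumptions for illustration.

```python
import math


def tpr_at_fixed_fpr(scores, labels, max_fpr=0.05):
    """Fraction of pathogenic variants (label 1) scoring strictly above a
    threshold chosen so that at most max_fpr of benign variants (label 0)
    score above it."""
    benign = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    pathogenic = [s for s, y in zip(scores, labels) if y == 1]
    if not benign or not pathogenic:
        raise ValueError("need at least one benign and one pathogenic variant")
    allowed = int(math.floor(max_fpr * len(benign)))  # benign calls tolerated
    threshold = benign[allowed] if allowed < len(benign) else float("-inf")
    return sum(s > threshold for s in pathogenic) / len(pathogenic)


def wilcoxon_signed_rank_approx(x, y):
    """Two-sided Wilcoxon signed-rank test on paired samples using the
    normal approximation. Zero differences are dropped; tied absolute
    differences share their average rank. Returns (w_plus, z, p_value)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    z = (w_plus - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return w_plus, z, p_value
```

In use, `tpr_at_fixed_fpr` would be evaluated once per phenotype for the bulk-based model and once for the single-cell-based model, and the two resulting lists passed to `wilcoxon_signed_rank_approx` as paired samples.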

CONCLUSIONS

Phenotype-specific models that integrate gene expression data with CADD scores show great promise in improving variant prioritisation. In addition to improving diagnostic accuracy, these models offer insights into the underlying biological mechanisms of rare diseases. Enriching existing pathogenicity-related scores with gene expression datasets has the potential to advance personalised medicine through more accurate and interpretable variant prioritisation.
