A corpus of GA4GH phenopackets: Case-level phenotyping for genomic diagnostics and discovery.

Authors

Danis Daniel, Bamshad Michael J, Bridges Yasemin, Caballero-Oteyza Andrés, Cacheiro Pilar, Carmody Leigh C, Chimirri Leonardo, Chong Jessica X, Coleman Ben, Dalgleish Raymond, Freeman Peter J, Graefe Adam S L, Groza Tudor, Hansen Peter, Jacobsen Julius O B, Klocperk Adam, Kusters Maaike, Ladewig Markus S, Marcello Allison J, Mattina Teresa, Mungall Christopher J, Munoz-Torres Monica C, Reese Justin T, Rehburg Filip, Reis Bárbara C S, Schuetz Catharina, Smedley Damian, Strauss Timmy, Sundaramurthi Jagadish Chandrabose, Thun Sylvia, Wissink Kyran, Wagstaff John F, Zocche David, Haendel Melissa A, Robinson Peter N

Affiliations

Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany; The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington CT 06032, USA.

Department of Pediatrics, Division of Genetic Medicine, University of Washington, 1959 NE Pacific Street, Box 357371, Seattle, WA 98195, USA; Brotman-Baty Institute for Precision Medicine, 1959 NE Pacific Street, Box 357657, Seattle, WA 98195, USA; Department of Pediatrics, Division of Genetic Medicine, Seattle Children's Hospital, Seattle, WA 98195, USA.

Publication information

HGG Adv. 2025 Jan 9;6(1):100371. doi: 10.1016/j.xhgg.2024.100371. Epub 2024 Oct 10.

DOI:10.1016/j.xhgg.2024.100371
PMID:39394689
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11564936/
Abstract

The Global Alliance for Genomics and Health (GA4GH) Phenopacket Schema was released in 2022 and approved by ISO as a standard for sharing clinical and genomic information about an individual, including phenotypic descriptions, numerical measurements, genetic information, diagnoses, and treatments. A phenopacket can be used as an input file for software that supports phenotype-driven genomic diagnostics and for algorithms that facilitate patient classification and stratification for identifying new diseases and treatments. There has been a great need for a collection of phenopackets to test software pipelines and algorithms. Here, we present Phenopacket Store. Phenopacket Store v.0.1.19 includes 6,668 phenopackets representing 475 Mendelian and chromosomal diseases associated with 423 genes and 3,834 unique pathogenic alleles curated from 959 different publications. This represents the first large-scale collection of case-level, standardized phenotypic information derived from case reports in the literature with detailed descriptions of the clinical data and will be useful for many purposes, including the development and testing of software for prioritizing genes and diseases in diagnostic genomics, machine learning analysis of clinical phenotype data, patient stratification, and genotype-phenotype correlations. This corpus also provides best-practice examples for curating literature-derived data using the GA4GH Phenopacket Schema.
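To make the idea of a phenopacket as a software input file concrete, the following is a minimal sketch of reading one case record and separating observed from excluded phenotypic features. It assumes the JSON serialization of Phenopacket Schema v2 (fields such as `phenotypicFeatures`, `type`, `excluded`, and `diseases`); the record itself, the function name `summarize_phenopacket`, and the placeholder disease identifier are invented for illustration and are not taken from Phenopacket Store.

```python
import json

# Hypothetical toy record. Field names follow the Phenopacket Schema v2 JSON
# layout; the case, subject, and disease identifier are made up for this sketch.
TOY_PHENOPACKET = """
{
  "id": "example-case-1",
  "subject": {"id": "proband", "sex": "FEMALE"},
  "phenotypicFeatures": [
    {"type": {"id": "HP:0001250", "label": "Seizure"}},
    {"type": {"id": "HP:0001263", "label": "Global developmental delay"}},
    {"type": {"id": "HP:0000252", "label": "Microcephaly"}, "excluded": true}
  ],
  "diseases": [
    {"term": {"id": "OMIM:000000", "label": "Placeholder disease"}}
  ]
}
"""


def summarize_phenopacket(text):
    """Return observed/excluded HPO term ids and disease ids for one phenopacket."""
    packet = json.loads(text)
    observed, excluded = [], []
    for feature in packet.get("phenotypicFeatures", []):
        term_id = feature["type"]["id"]
        # A feature counts as observed unless it is explicitly marked excluded.
        if feature.get("excluded", False):
            excluded.append(term_id)
        else:
            observed.append(term_id)
    diseases = [d["term"]["id"] for d in packet.get("diseases", [])]
    return {
        "id": packet["id"],
        "observed": observed,
        "excluded": excluded,
        "diseases": diseases,
    }


summary = summarize_phenopacket(TOY_PHENOPACKET)
print(summary)
```

A phenotype-driven prioritization tool would typically consume the observed and excluded term lists in this form; applying the same function to each file in a collection of phenopackets would yield a simple case-level table for benchmarking.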

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0d6/11564936/b74ea64fb4d8/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0d6/11564936/e38b32d4c00d/gr2.jpg
