Semantically enabling a genome-wide association study database.

Authors

Beck Tim, Free Robert C, Thorisson Gudmundur A, Brookes Anthony J

Affiliation

Department of Genetics, University of Leicester, University Road, Leicester, UK.

Publication information

J Biomed Semantics. 2012 Dec 17;3(1):9. doi: 10.1186/2041-1480-3-9.

DOI:10.1186/2041-1480-3-9

PMID:23244533

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3579732/

Abstract

BACKGROUND

The volume of data generated by genome-wide association studies (GWAS) has grown rapidly, but considerations for the reuse and interchange of GWAS phenotype data have not kept pace. This affects the work of GWAS Central, a free and open-access resource for the advanced querying and comparison of summary-level genetic association data. The benefits of employing ontologies for standardising and structuring data are widely accepted. The complex spectrum of observed human phenotypes (and traits), and the requirement for cross-species phenotype comparisons, call for reflection on the most appropriate solution for organising human phenotype data. The Semantic Web provides standards that make possible the further integration of GWAS data and the ability to contribute to the web of Linked Data.

RESULTS

A pragmatic consideration when applying phenotype ontologies to GWAS data is the ability to retrieve all data, at the most granular level possible, by querying a single ontology graph. We found the Medical Subject Headings (MeSH) terminology suitable for describing all traits (diseases and medical signs and symptoms) at various levels of granularity and the Human Phenotype Ontology (HPO) most suitable for describing phenotypic abnormalities (medical signs and symptoms) at the most granular level. Diseases within MeSH are mapped to HPO to infer the phenotypic abnormalities associated with diseases. Building on the rich semantic phenotype annotation layer, we are able to make cross-species phenotype comparisons and publish a core subset of GWAS data as RDF nanopublications.
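The retrieval and inference ideas in this paragraph can be illustrated with a minimal Python sketch. It is not GWAS Central's implementation: the term identifiers, the toy hierarchy, the disease-to-phenotype mapping and the study annotations are all invented for illustration, standing in for MeSH terms, HPO terms and real study records.

```python
from collections import defaultdict

# Toy is_a edges (child -> parents). All identifiers are hypothetical
# placeholders, not real MeSH or HPO terms.
IS_A = {
    "trait:metabolic_disease": [],
    "trait:diabetes": ["trait:metabolic_disease"],
    "trait:type2_diabetes": ["trait:diabetes"],
}

# Hypothetical disease -> phenotypic abnormality mapping
# (stands in for a MeSH disease -> HPO term mapping).
DISEASE_TO_PHENOTYPES = {
    "trait:type2_diabetes": {"pheno:hyperglycemia", "pheno:insulin_resistance"},
}

# Hypothetical annotations: study id -> most granular trait term applied.
STUDY_ANNOTATIONS = {
    "study1": "trait:type2_diabetes",
    "study2": "trait:diabetes",
    "study3": "trait:metabolic_disease",
}


def descendants(term, is_a):
    """Return every term reachable below `term` in the is_a graph."""
    children = defaultdict(set)
    for child, parents in is_a.items():
        for parent in parents:
            children[parent].add(child)
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def studies_for_term(term):
    """Retrieve studies annotated to `term` or to any more granular term."""
    wanted = {term} | descendants(term, IS_A)
    return sorted(s for s, t in STUDY_ANNOTATIONS.items() if t in wanted)


def inferred_phenotypes(study):
    """Infer phenotypic abnormalities from the study's disease annotation."""
    return DISEASE_TO_PHENOTYPES.get(STUDY_ANNOTATIONS[study], set())
```

Under these assumptions, querying a broad term such as `trait:diabetes` also returns studies annotated to its more specific descendants, which is the practical benefit of annotating against a single graph.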

CONCLUSIONS

We present a methodology for applying phenotype annotations to a comprehensive genome-wide association dataset and for ensuring compatibility with the Semantic Web. The annotations are used to assist with cross-species genotype and phenotype comparisons. However, further processing and deconstruction of terms may be required to facilitate automatic phenotype comparisons. The provision of GWAS nanopublications enables a new dimension for exploring GWAS data, by way of intrinsic links to related data resources within the Linked Data web. The value of such annotation and integration will grow as more biomedical resources adopt the standards of the Semantic Web.
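To make the nanopublication idea concrete, the following Python sketch assembles the usual four named graphs (head, assertion, provenance, publication info) as a TriG string. It assumes the `np:` terms from the nanopublication schema namespace; the `ex:` vocabulary, the example URIs and the function name `build_nanopub` are hypothetical and do not reflect the schema GWAS Central actually publishes.

```python
def build_nanopub(base, marker, trait, p_value, study):
    """Return a TriG string for one marker-trait association.

    `base`, `marker`, `trait` and `study` are URIs supplied by the caller;
    the ex: predicates are illustrative placeholders only.
    """
    return f"""@prefix np: <http://www.nanopub.org/nschema#> .
@prefix ex: <http://example.org/vocab#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<{base}#head> {{
    <{base}> a np:Nanopublication ;
        np:hasAssertion <{base}#assertion> ;
        np:hasProvenance <{base}#provenance> ;
        np:hasPublicationInfo <{base}#pubinfo> .
}}

<{base}#assertion> {{
    <{base}#association> ex:hasMarker <{marker}> ;
        ex:hasTrait <{trait}> ;
        ex:hasPValue "{p_value}"^^xsd:double .
}}

<{base}#provenance> {{
    <{base}#assertion> ex:derivedFromStudy <{study}> .
}}

<{base}#pubinfo> {{
    <{base}> ex:publishedBy <http://example.org/publisher> .
}}
"""


trig = build_nanopub(
    base="http://example.org/np/1",
    marker="http://example.org/marker/m1",
    trait="http://example.org/trait/t1",
    p_value=5e-08,
    study="http://example.org/study/s1",
)
```

Because the trait is referenced by URI, the same assertion can in principle be joined with any other Linked Data resource that uses that URI, which is the "intrinsic links" benefit described above.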





