

Genomic Sequence Variation Markup Language (GSVML).

Affiliation

Information Center for Medical Sciences, Tokyo Medical and Dental University, Bunkyo, Tokyo, Japan.

Publication information

Int J Med Inform. 2010 Feb;79(2):130-42. doi: 10.1016/j.ijmedinf.2009.11.003. Epub 2009 Dec 6.

Abstract

OBJECTIVE

Genomic sequence variation data are accumulating rapidly worldwide as genomic research expands. Making good use of these data requires an interoperable data exchange format and its international standardization. Genomic Sequence Variation Markup Language (GSVML) focuses on genomic sequence variation data and human health applications, such as gene-based medicine and pharmacogenomics.

DESIGN AND METHOD

We developed GSVML in eight steps, based on case analysis and domain investigations. By restricting the design scope to human health applications and genomic sequence variation, we attempted to eliminate ambiguity and ensure practicability. We aimed to satisfy the requirements derived from a use case analysis of human clinical genomic applications. Based on database investigations, we attempted to minimize redundancy in the data format while maximizing data coverage. We also attempted to ensure that GSVML can communicate and interface with other markup languages, so that omics data can be exchanged among omics researchers and facilities. We analyzed its ability to interface with developing clinical standards, such as the Health Level Seven Genotype Information model.

RESULTS

The human health-oriented GSVML comprises three categories: variation data, direct annotation, and indirect annotation. The variation data category is required, while the direct and indirect annotation categories are optional. The annotation categories contain omics and clinical information and have internal relationships. For the design, we examined 6 cases against three criteria for human health applications, and 15 data elements against three criteria for data formats for genomic sequence variation data exchange. We investigated the data formats of five international SNP databases and six markup languages, and the ability to interface with the Health Level Seven Genotype Model in terms of 317 items.
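The three-category layout described above, with one required and two optional categories, could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the GSVML schema, so the class name `Record`, the helper `_add_category`, and every tag and field name here are hypothetical placeholders rather than real GSVML elements.

```python
from dataclasses import dataclass
from typing import Optional
import xml.etree.ElementTree as ET


def _add_category(parent: ET.Element, tag: str, fields: dict) -> None:
    """Append one category element whose children are simple name/value pairs."""
    node = ET.SubElement(parent, tag)
    for name, value in fields.items():
        ET.SubElement(node, name).text = str(value)


@dataclass
class Record:
    """Hypothetical container mirroring the three categories named in the abstract."""
    variation_data: dict                        # required category
    direct_annotation: Optional[dict] = None    # optional category
    indirect_annotation: Optional[dict] = None  # optional category

    def to_xml(self) -> ET.Element:
        # The abstract states that only the variation data category is required.
        if not self.variation_data:
            raise ValueError("the variation data category is required")
        root = ET.Element("record")  # placeholder tag, not a real GSVML element
        _add_category(root, "variation_data", self.variation_data)
        if self.direct_annotation is not None:
            _add_category(root, "direct_annotation", self.direct_annotation)
        if self.indirect_annotation is not None:
            _add_category(root, "indirect_annotation", self.indirect_annotation)
        return root


if __name__ == "__main__":
    # Field names and values are made up purely for illustration.
    record = Record(
        variation_data={"allele": "A"},
        direct_annotation={"note": "EXAMPLE"},
    )
    print(ET.tostring(record.to_xml(), encoding="unicode"))
```

Keeping the optional categories as `None` by default means a record carrying only variation data still serializes, which matches the required/optional split stated in the results.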

CONCLUSION

GSVML was developed as a potential data exchange format for genomic sequence variation data, focusing on human health applications. International standardization of GSVML is necessary and is currently underway. GSVML can enhance the utilization of genomic sequence variation data worldwide by providing a common platform for communication between clinical and research applications.

