Watkins Michael, Rynearson Shawn, Henrie Alex, Eilbeck Karen
Biomedical Informatics, 421 Wakara Way, University of Utah, Salt Lake City, Utah 84108.
AMIA Annu Symp Proc. 2020 Mar 4;2019:1226-1235. eCollection 2019.
Current methods used for representing biological sequence variants allow flexibility, which has created redundancy within variant archives and discordance among variant representation tools. While research methodologies have been able to adapt to this ambiguity, strict clinical standards make it difficult to use these data in what would otherwise be useful clinical interventions. We implemented a specification developed by the GA4GH Variant Modeling Collaboration (VMC), which details a new approach to representing variants unambiguously as alleles, haplotypes, or genotypes. Our implementation, called the VMC Test Suite (http://vcfclin.org), offers web tools to generate VMC identifiers and insert them into a VCF file, and to generate a VMC bundle JSON representation of a VCF file or HGVS expression. A command line tool with similar functionality is also introduced. These tools facilitate use of this standard, an important step toward reliable querying of variants and their associated annotations.
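To illustrate why a computed identifier removes ambiguity, the following is a minimal sketch of a digest-based allele identifier. It is not the VMC Test Suite's code: the function names, the serialization string, the `VMC:GA_` prefix, the 24-byte truncation, and the example coordinates are all assumptions made for illustration, and the VMC specification should be consulted for the actual rules.

```python
import base64
import hashlib


def truncated_digest(blob: bytes, n_bytes: int = 24) -> str:
    """Return the URL-safe base64 encoding of the first n_bytes of a SHA-512 digest.

    The choice of SHA-512 and a 24-byte truncation is an assumption here.
    """
    digest = hashlib.sha512(blob).digest()[:n_bytes]
    return base64.urlsafe_b64encode(digest).decode("ascii")


def allele_identifier(sequence_id: str, start: int, end: int, state: str) -> str:
    """Build a hypothetical identifier for an allele.

    The allele is described by a reference sequence, an interval on that
    sequence, and the replacement sequence (state). The serialization
    format below is illustrative only.
    """
    serialized = f"<Allele:{sequence_id}:{start}:{end}:{state}>"
    return "VMC:GA_" + truncated_digest(serialized.encode("utf-8"))


# Hypothetical example: the same allele always yields the same identifier,
# regardless of which tool or archive computes it.
first = allele_identifier("NC_000019.10", 44908821, 44908822, "T")
second = allele_identifier("NC_000019.10", 44908821, 44908822, "T")
other = allele_identifier("NC_000019.10", 44908821, 44908822, "C")
print(first == second)  # identical inputs give identical identifiers
print(first == other)   # a different state gives a different identifier
```

Because the identifier is derived only from the content of the allele, two systems that normalize a variant the same way can compare identifiers directly instead of reconciling free-form descriptions.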