Harvard Medical School, Boston, MA 02115, USA.
Bioinformatics. 2011 Mar 15;27(6):891-3. doi: 10.1093/bioinformatics/btr029. Epub 2011 Jan 22.
Accurate annotations of genomic variants are necessary to achieve full-genome clinical interpretations that are scientifically sound and medically relevant. Many disease associations, especially those reported before the completion of the Human Genome Project (HGP), are limited in applicability because of potential inconsistencies with current standards for genomic coordinates, nomenclature and gene structure. To validate variants from the medical genetics literature and link each one to an unambiguous reference, we developed a software pipeline and reviewed 68 641 single amino acid mutations from Online Mendelian Inheritance in Man (OMIM), the Human Gene Mutation Database (HGMD) and dbSNP. The frequency of unresolved mutation annotations varied widely among the databases, ranging from 4 to 23%. We also produced a taxonomy of the primary causes of unresolved mutations.
The software pipeline is freely available from the web site (http://safegene.hms.harvard.edu/aa2nt/).
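The kind of check described above, testing whether a reported single amino acid mutation can be mapped onto a reference sequence, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `resolve_substitution`, the status labels and the toy coding sequence are assumptions made for illustration, and only the standard genetic code is taken as given. The sketch confirms that the reference codon encodes the reported wild-type residue, then lists every single-nucleotide change that would produce the reported mutant residue.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def resolve_substitution(cds, position, ref_aa, alt_aa):
    """Try to map an amino acid substitution onto a coding sequence.

    cds      -- reference coding sequence (DNA, starting at the start codon)
    position -- 1-based residue number of the reported mutation
    ref_aa   -- reported wild-type residue (one-letter code)
    alt_aa   -- reported mutant residue (one-letter code)

    Returns (status, changes). Each change is a tuple
    (1-based coding position, reference base, mutant base).
    The status labels are hypothetical, not the paper's taxonomy.
    """
    if position < 1:
        return "position_out_of_range", []
    start = 3 * (position - 1)
    codon = cds[start:start + 3].upper()
    if len(codon) < 3:
        return "position_out_of_range", []
    # The reference codon must encode the residue the report claims is wild type.
    if CODON_TABLE.get(codon) != ref_aa:
        return "reference_mismatch", []

    changes = []
    # Try every single-base replacement within the codon.
    for offset, base in product(range(3), BASES):
        if base == codon[offset]:
            continue
        mutant = codon[:offset] + base + codon[offset + 1:]
        if CODON_TABLE[mutant] == alt_aa:
            changes.append((start + offset + 1, codon[offset], base))

    status = "resolved" if changes else "no_single_nucleotide_change"
    return status, changes


# Toy coding sequence (assumed for illustration): Met-Arg-Glu.
toy_cds = "ATGCGTGAA"
status, changes = resolve_substitution(toy_cds, 2, "R", "H")
print(status, changes)
```

A mutation that fails the reference check, or that cannot be explained by a single base change, would be counted as unresolved in this sketch; a real pipeline would also need to handle transcript versions, alternative numbering schemes and gene structure changes, which are not modeled here.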