Arneson Adriana, Felsheim Brooke, Chien Jennifer, Ernst Jason
Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
NAR Genom Bioinform. 2020 Dec 17;2(4):lqaa104. doi: 10.1093/nargab/lqaa104. eCollection 2020 Dec.
ConsHMM is a method recently introduced to annotate genomes into conservation states, which are defined based on the combinatorial and spatial patterns of which species align to and match a reference genome in a multi-species DNA sequence alignment. Previously, ConsHMM was only applied to a single genome for one multi-species sequence alignment. Here, we apply ConsHMM to produce 22 additional genome annotations covering human and seven other organisms for a variety of multi-species alignments. Additionally, we extend ConsHMM to generate allele-specific annotations, which we use to produce conservation state annotations for every possible single-nucleotide mutation in the human genome. Finally, we provide a web interface to interactively visualize parameters and annotation enrichments for ConsHMM models. These annotations and visualizations comprise the ConsHMM Atlas, which we expect will be a valuable resource for analyzing a variety of major genomes and genetic variation.
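As a rough illustration of the "align to and match" idea described above, the following minimal Python sketch encodes one reference position into a per-species observation, and then re-encodes it for each alternate allele. It is a sketch under stated assumptions, not the ConsHMM implementation: the function names, the observation labels, the use of "-" to mark an unaligned species, and the way alternate alleles are handled are all hypothetical choices made for illustration.

```python
from typing import Dict

# Hypothetical observation labels (illustrative, not ConsHMM's own encoding).
MATCH = "match"          # species aligns and carries the reference nucleotide
MISMATCH = "mismatch"    # species aligns but carries a different nucleotide
UNALIGNED = "unaligned"  # species has no aligned nucleotide at this position

GAP = "-"  # assumption: a gap character marks a species that does not align


def encode_position(ref_base: str, aligned: Dict[str, str]) -> Dict[str, str]:
    """Label each non-reference species at one alignment column."""
    ref = ref_base.upper()
    observations = {}
    for species, base in aligned.items():
        base = base.upper()
        if base == GAP:
            observations[species] = UNALIGNED
        elif base == ref:
            observations[species] = MATCH
        else:
            observations[species] = MISMATCH
    return observations


def encode_all_alleles(ref_base: str, aligned: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Assumed allele-specific variant: re-encode the column as if each
    alternate nucleotide were the reference allele."""
    ref = ref_base.upper()
    return {
        alt: encode_position(alt, aligned)
        for alt in "ACGT"
        if alt != ref
    }


# Toy alignment column; species names and bases are made up for the example.
column = {"chimp": "A", "mouse": "G", "chicken": "-"}
reference_obs = encode_position("A", column)
allele_obs = encode_all_alleles("A", column)
```

In this toy column, substituting G for the reference A flips the chimp and mouse labels while the unaligned chicken entry is unchanged, which is the kind of per-allele difference an allele-specific annotation would need to capture.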