Bioinformatics. 2019 Jul 15;35(14):2492-2494. doi: 10.1093/bioinformatics/bty993.
When analyzing sequence data, genetic variants are typically considered one by one, without regard to whether they occur in the same individual. However, combinations of variants might play a key role in some diseases, because variants that are neutral on their own can become deleterious when they occur together. GEMPROT is a new analysis tool that takes a phased VCF file as input and visualizes the consequences of the genetic variants on the protein. At the level of an individual, the program shows the variants on each of the two protein sequences together with the Pfam functional protein domains. When data on several individuals are available, GEMPROT lists the haplotypes found in the sample and can compare the haplotype distributions between different sub-groups of individuals. By offering a global view of the gene and the genetic variants it carries, GEMPROT makes it easier to understand the impact of combinations of genetic variants on the protein sequence.
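The idea of reading variants per haplotype rather than one by one can be illustrated with a minimal Python sketch. This is not GEMPROT's implementation. It assumes a toy coding sequence, single-nucleotide variants given as 0-based positions within that sequence, and phased genotypes written as in a VCF `GT` field (`0|1`, `1|0`, `1|1`). The function names (`apply_phased_snvs`, `translate`, `protein_changes`) are hypothetical and introduced only for illustration; VCF parsing, transcript mapping and Pfam domains are left out.

```python
from collections import Counter

# Standard genetic code, built in the usual TCAG order.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(cds):
    """Translate a coding sequence, ignoring any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def apply_phased_snvs(ref_cds, variants):
    """Return the two haplotype coding sequences of one individual.

    variants: iterable of (pos, ref, alt, gt), where pos is 0-based in the
    coding sequence and gt is a phased genotype such as "0|1".
    """
    haps = [list(ref_cds), list(ref_cds)]
    for pos, ref, alt, gt in variants:
        if ref_cds[pos] != ref:
            raise ValueError(f"reference mismatch at position {pos}")
        alleles = gt.split("|")
        if len(alleles) != 2:
            raise ValueError(f"genotype {gt!r} is not phased diploid")
        for hap_index, allele in enumerate(alleles):
            if allele == "1":
                haps[hap_index][pos] = alt
    return "".join(haps[0]), "".join(haps[1])


def protein_changes(ref_protein, alt_protein):
    """List substitutions such as 'A2V' (1-based protein positions)."""
    return [
        f"{r}{i + 1}{a}"
        for i, (r, a) in enumerate(zip(ref_protein, alt_protein))
        if r != a
    ]


# Toy data (assumed for illustration): two individuals, same two variants,
# carried on the same haplotype in one and on opposite haplotypes in the other.
REF_CDS = "ATGGCTGAAAAGTAA"
INDIVIDUALS = {
    "ind1": [(4, "C", "T", "1|0"), (6, "G", "A", "1|0")],
    "ind2": [(4, "C", "T", "1|0"), (6, "G", "A", "0|1")],
}

ref_protein = translate(REF_CDS)
haplotype_counts = Counter()
for name, variants in INDIVIDUALS.items():
    for cds in apply_phased_snvs(REF_CDS, variants):
        protein = translate(cds)
        haplotype_counts[protein] += 1
        print(name, protein, protein_changes(ref_protein, protein))

print(haplotype_counts)
```

In this toy sample both individuals carry the same two variants, yet only the first has a protein copy with both substitutions. A variant-by-variant analysis would not distinguish the two individuals, which is the situation the abstract describes.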
GEMPROT is freely available at https://github.com/TaniaCuppens/GEMPROT. An online version is also available at http://med-laennec.univ-brest.fr/GEMPROT/.
Supplementary data are available at Bioinformatics online.