Zhao Huiying, Yang Yuedong, Lin Hai, Zhang Xinjun, Mort Matthew, Cooper David N, Liu Yunlong, Zhou Yaoqi
Genome Biol. 2013 Mar 13;14(3):R23. doi: 10.1186/gb-2013-14-3-r23.
Micro-indels (insertions or deletions shorter than 21 bp) constitute the second most frequent class of human gene mutation after single nucleotide variants. Despite the relative abundance of non-frameshifting indels, their damaging effect on protein structure and function has gone largely unstudied. We have developed a support vector machine-based method named DDIG-in (Detecting disease-causing genetic variations due to indels) to prioritize non-frameshifting indels by comparing disease-associated mutations with putatively neutral mutations from the 1000 Genomes Project. The final model discriminates well between disease-associated and putatively neutral indels and is robust against annotation errors. A web server implementing DDIG-in is available at http://sparks-lab.org/ddig.
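To make the terminology concrete, the following minimal Python sketch shows how an indel might be labeled as a micro-indel and as frameshifting or non-frameshifting from its alleles. It is an illustration under stated assumptions, not part of DDIG-in: the names `classify_indel`, `IndelCall`, and `MAX_MICRO_INDEL_BP` are hypothetical, the alleles are assumed to be given as plain reference and alternate strings, and the variant is assumed to lie entirely within coding sequence, where a net length change that is a multiple of 3 preserves the reading frame.

```python
from dataclasses import dataclass

# Assumption: "shorter than 21 bp" is read as a net length change of at most 20 bp.
MAX_MICRO_INDEL_BP = 20


@dataclass(frozen=True)
class IndelCall:
    kind: str            # "insertion" or "deletion"
    length_bp: int       # net number of bases gained or lost
    is_micro: bool       # True when length_bp <= MAX_MICRO_INDEL_BP
    is_frameshift: bool  # True when length_bp is not a multiple of 3


def classify_indel(ref: str, alt: str) -> IndelCall:
    """Label a coding-region indel from its reference and alternate alleles.

    Hypothetical helper: only the net length change is inspected, so
    equal-length substitutions are rejected rather than classified.
    """
    diff = len(alt) - len(ref)
    if diff == 0:
        raise ValueError("alleles have equal length; not a simple indel")
    length = abs(diff)
    return IndelCall(
        kind="insertion" if diff > 0 else "deletion",
        length_bp=length,
        is_micro=length <= MAX_MICRO_INDEL_BP,
        is_frameshift=length % 3 != 0,
    )


if __name__ == "__main__":
    # A 3-bp deletion removes one codon and keeps the reading frame.
    print(classify_indel("ACTT", "A"))
    # A 2-bp insertion shifts the reading frame.
    print(classify_indel("A", "AGG"))
```

Under these assumptions, only calls with `is_micro` true and `is_frameshift` false would fall into the class of variants that the method described here is meant to prioritize.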