Akgün Mete, Faruk Gerdan Ö, Görmez Zeliha, Demirci Hüseyin
Advanced Genomics and Bioinformatics Research Center (İGBAM), Informatics and Information Security Research Center (BİLGEM), The Scientific and Technological Research Council of Turkey (TÜBİTAK), 41470 Gebze, Kocaeli, Turkey.
J Biomed Inform. 2016 Apr;60:319-27. doi: 10.1016/j.jbi.2016.02.013. Epub 2016 Feb 27.
The availability of whole exome and whole genome sequencing has completely changed the structure of genetic disease studies. It is now possible to uncover disease-causing mechanisms in less time and on smaller budgets. As a result, extracting valuable information from the huge amount of data produced by next-generation techniques has become a challenging task. Current tools analyze sequencing data using a variety of methods. However, there is still a need for fast, easy-to-use and effective tools. For genetic disease studies in particular, there is a lack of publicly available tools that support the compound heterozygous and de novo models. In addition, existing tools either require advanced IT expertise or handle large variant files inefficiently. In this work, we present FMFilter, an efficient filtering tool for next-generation sequencing data produced by genetic disease studies. The software lets the user choose the inheritance model (recessive, dominant, compound heterozygous or de novo) as well as the affected and control individuals. It provides a user-friendly graphical user interface, which removes the need for advanced computing skills. Its various filtering options eliminate the majority of false positives. FMFilter requires negligible memory, so it can easily handle very large variant files, such as multiple whole genomes, on ordinary computers. We demonstrate the variant reduction capability and effectiveness of the proposed tool on public and in-house data for different inheritance models. We also compare FMFilter with existing filtering software. We conclude that FMFilter provides an effective and easy-to-use environment for analyzing next-generation sequencing data from Mendelian diseases.
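To make the four inheritance models concrete, the following is a minimal Python sketch of genotype-based filtering over a stream of variants. It is not FMFilter's implementation: the `Variant` record, the genotype encoding (count of alternate alleles), the sample names and all function names are assumptions made for illustration only. Variants are processed one at a time, which is one plausible way a filter can keep memory use low on large files.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical variant record, used only for this illustration."""
    gene: str
    name: str
    # Alternate-allele count per sample: 0 = hom-ref, 1 = het, 2 = hom-alt.
    genotypes: Dict[str, int]


def is_recessive(gt: Dict[str, int], affected: List[str], controls: List[str]) -> bool:
    # Every affected individual is homozygous alternate; no control is.
    return all(gt[s] == 2 for s in affected) and all(gt[s] != 2 for s in controls)


def is_dominant(gt: Dict[str, int], affected: List[str], controls: List[str]) -> bool:
    # Every affected individual carries the allele; no control carries it.
    return all(gt[s] >= 1 for s in affected) and all(gt[s] == 0 for s in controls)


def is_de_novo(gt: Dict[str, int], child: str, parents: List[str]) -> bool:
    # The child carries the allele and neither parent does.
    return gt[child] >= 1 and all(gt[p] == 0 for p in parents)


def filter_variants(variants: Iterable[Variant],
                    keep: Callable[[Variant], bool]) -> Iterator[Variant]:
    # Lazily yield matching variants so the full input never sits in memory.
    for v in variants:
        if keep(v):
            yield v


def compound_het_pairs(variants: Iterable[Variant], child: str,
                       mother: str, father: str) -> List[Tuple[str, str]]:
    # Within each gene, pair a heterozygous variant inherited only from the
    # mother with one inherited only from the father.
    by_gene = defaultdict(lambda: ([], []))  # gene -> (maternal, paternal)
    for v in variants:
        gt = v.genotypes
        if gt[child] != 1:
            continue
        if gt[mother] == 1 and gt[father] == 0:
            by_gene[v.gene][0].append(v.name)
        elif gt[father] == 1 and gt[mother] == 0:
            by_gene[v.gene][1].append(v.name)
    return [(m, f) for maternal, paternal in by_gene.values()
            for m in maternal for f in paternal]


# Made-up trio data, purely for demonstration.
EXAMPLE = [
    Variant("GENE1", "v1", {"child": 1, "mother": 1, "father": 0}),
    Variant("GENE1", "v2", {"child": 1, "mother": 0, "father": 1}),
    Variant("GENE2", "v3", {"child": 2, "mother": 1, "father": 1}),
    Variant("GENE3", "v4", {"child": 1, "mother": 0, "father": 0}),
]
```

With this toy trio, the recessive filter keeps only `v3`, the de novo filter keeps only `v4`, and `v1` and `v2` form a compound heterozygous pair in `GENE1`. A real tool would also need to handle missing genotypes, quality thresholds and population frequency filters, which this sketch omits.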