Tebel Katrin, Boldt Vivien, Steininger Anne, Port Matthias, Ebert Grit, Ullmann Reinhard
Max Planck Institute for Molecular Genetics, 14195, Berlin, Germany.
Department of Biology, Chemistry and Pharmacy, Free University Berlin, 14195, Berlin, Germany.
BMC Bioinformatics. 2017 Jan 6;18(1):19. doi: 10.1186/s12859-016-1430-x.
The analysis of DNA copy number variants (CNVs) has an increasing impact on genetic diagnostics and research. However, the interpretation of CNV data derived from high-resolution array CGH or NGS platforms is complicated by the considerable variability of the human genome. Tools for multidimensional data analysis and for the comparison of patient cohorts are therefore needed to help discriminate clinically relevant CNVs from the remaining variants.
We developed GenomeCAT, a standalone Java application for the analysis and integrative visualization of CNVs. GenomeCAT is composed of three modules dedicated, respectively, to the inspection of single cases, the comparative analysis of multidimensional data, and group comparisons that aim to identify recurrent aberrations in patients sharing the same phenotype. Its flexible import options ease the comparison of a user's own results, derived from microarray or NGS platforms, with data from the literature or public repositories. Multidimensional data obtained from different experiment types can be merged into a common data matrix to enable joint visualization and analysis. All results are stored in the integrated MySQL database, but can also be exported as tab-delimited files for further statistical calculations in external programs.
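To make the idea of a common data matrix more concrete, the following is a minimal Python sketch, not GenomeCAT's actual implementation. It assumes that each data set is a list of `(chrom, start, end, value)` segments, maps them onto fixed-size genomic bins, computes per bin the fraction of samples carrying an aberration, and writes the matrix as tab-delimited text. The bin width, the threshold of 0.3, the "largest magnitude wins" rule, and all function names are illustrative assumptions.

```python
import csv
import io

BIN_SIZE = 1_000_000  # hypothetical bin width in base pairs


def bin_segments(segments, bin_size=BIN_SIZE):
    """Map (chrom, start, end, value) segments onto fixed-size bins.

    Returns {(chrom, bin_index): value}. If several segments touch the
    same bin, the value with the largest magnitude is kept (an arbitrary
    choice made for this sketch).
    """
    binned = {}
    for chrom, start, end, value in segments:
        first = start // bin_size
        last = (end - 1) // bin_size  # 'end' is treated as exclusive
        for idx in range(first, last + 1):
            key = (chrom, idx)
            if key not in binned or abs(value) > abs(binned[key]):
                binned[key] = value
    return binned


def common_matrix(tracks, bin_size=BIN_SIZE):
    """Merge {name: segments} into one matrix: one row per bin, one column per track."""
    binned = {name: bin_segments(segs, bin_size) for name, segs in tracks.items()}
    names = sorted(binned)
    keys = sorted({key for bins in binned.values() for key in bins})
    rows = [[binned[name].get(key) for name in names] for key in keys]
    return keys, names, rows


def recurrence(rows, threshold=0.3):
    """Per bin, the fraction of tracks with |value| >= threshold (missing counts as no call)."""
    return [
        sum(1 for v in row if v is not None and abs(v) >= threshold) / len(row)
        for row in rows
    ]


def to_tsv(keys, names, rows, bin_size=BIN_SIZE):
    """Serialize the matrix as tab-delimited text, writing missing values as NA."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["chrom", "start", "end", *names])
    for (chrom, idx), row in zip(keys, rows):
        cells = ["NA" if v is None else v for v in row]
        writer.writerow([chrom, idx * bin_size, (idx + 1) * bin_size, *cells])
    return out.getvalue()


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    tracks = {
        "patient_A": [("chr1", 500_000, 2_500_000, -0.8)],
        "patient_B": [("chr1", 1_200_000, 1_900_000, -0.7)],
    }
    keys, names, rows = common_matrix(tracks)
    print(to_tsv(keys, names, rows))
    print(recurrence(rows))
```

Fixed-size binning is only one way to bring heterogeneous tracks onto shared coordinates; it trades resolution for a simple rectangular matrix that external statistics programs can read directly.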
GenomeCAT offers a broad spectrum of visualization and analysis tools that assist in the evaluation of CNVs in the context of other experimental data and annotations. Using GenomeCAT does not require any specialized computer skills: the R packages employed for data analysis are fully integrated into GenomeCAT's graphical user interface, and the installation process is supported by a wizard. Its flexible data import and export, combined with the ability to create a common data matrix, also make the program well suited as an interface between genomic data from heterogeneous sources and external software tools. Owing to its modular architecture, the functionality of GenomeCAT can easily be extended with further R packages or customized plug-ins to meet future requirements.