Tang Zheng-Zheng, Bunn Paul, Tao Ran, Liu Zhouwen, Lin Dan-Yu
Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, 37203, TN, USA.
Department of Biostatistics, University of North Carolina, Chapel Hill, 27599, NC, USA.
BMC Genomics. 2017 Feb 14;18(1):160. doi: 10.1186/s12864-017-3573-1.
Meta-analysis is essential to the discovery of rare variants that influence complex diseases and traits. Four major software packages, namely MASS, MetaSKAT, RAREMETAL, and seqMeta, have been developed to perform meta-analysis of rare-variant associations. These packages first generate summary statistics for each study and then perform the meta-analysis by combining the summary statistics. Because of incompatible file formats and non-equivalent summary statistics, the output files from the study-level analysis of one package cannot be directly used to perform meta-analysis in another package.
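The combining step described above can be illustrated with a minimal sketch. It assumes each study reports a score vector U and its covariance matrix V over the same ordered set of variants; the function names and the in-memory layout are hypothetical and do not reflect the file format or API of MASS, MetaSKAT, RAREMETAL, or seqMeta. The example sums the study-level statistics and forms a burden-type Z statistic.

```python
import math

def combine_scores(studies):
    """Sum per-study score vectors U and covariance matrices V.

    studies: list of (U, V) pairs, where U is a list of m floats and
    V is an m x m list of lists (hypothetical layout for illustration).
    """
    m = len(studies[0][0])
    U = [0.0] * m
    V = [[0.0] * m for _ in range(m)]
    for u, v in studies:
        for j in range(m):
            U[j] += u[j]
            for k in range(m):
                V[j][k] += v[j][k]
    return U, V

def burden_z(U, V, weights):
    """Burden-type statistic: w'U / sqrt(w'Vw)."""
    m = len(U)
    numerator = sum(w * u for w, u in zip(weights, U))
    variance = sum(weights[j] * V[j][k] * weights[k]
                   for j in range(m) for k in range(m))
    return numerator / math.sqrt(variance)

# Two toy studies with two variants each (made-up numbers).
studies = [
    ([1.0, 2.0], [[4.0, 1.0], [1.0, 3.0]]),
    ([0.5, -1.0], [[2.0, 0.0], [0.0, 1.0]]),
]
U, V = combine_scores(studies)
z = burden_z(U, V, [1.0, 1.0])
```

Summing is only valid when every study defines U and V on the same scale and for the same alleles, which is the equivalence problem the abstract goes on to describe.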
We developed a computationally efficient software program, PreMeta, to resolve the incompatibility among the four software packages and to facilitate meta-analysis of large-scale sequencing studies in a consortium setting. PreMeta reformats the output files of study-level summary statistics generated by the four packages (text files produced by MASS and RAREMETAL, binary files produced by MetaSKAT, and R data files produced by seqMeta) and translates the summary statistics from one form to another, so that the summary statistics from any package can be used to perform meta-analysis in any other package. With this tool, consortium members are not required to use the same software for study-level analyses. In addition, PreMeta checks for allele mismatches, corrects summary statistics, and allows the rescaled inverse normal transformation to be performed at the meta-analysis stage by rescaling summary statistics.
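One way to picture the allele-mismatch check is the sketch below. It is an illustration under stated assumptions, not PreMeta's actual implementation: the function name, the (ref, alt) tuple layout, and the choice to zero out irreconcilable variants are all hypothetical, and strand flips are not handled. When a study codes a variant with the two alleles swapped, the score statistic changes sign, and each covariance entry is multiplied by the product of the two variants' signs.

```python
def align_to_reference(ref_alleles, study_alleles, U, V):
    """Align one study's score statistics to a reference allele coding.

    ref_alleles, study_alleles: lists of (ref, alt) tuples, one per variant.
    U: list of score statistics; V: covariance matrix as a list of lists.
    Returns (U_aligned, V_aligned, mismatched_indices).
    """
    signs = []
    mismatched = []
    for j, (expected, observed) in enumerate(zip(ref_alleles, study_alleles)):
        if observed == expected:
            signs.append(1.0)
        elif observed == (expected[1], expected[0]):
            # Alleles swapped: the effect direction reverses.
            signs.append(-1.0)
        else:
            # Assumption for this sketch: drop the variant's contribution
            # and report it, rather than guessing a correction.
            signs.append(0.0)
            mismatched.append(j)
    m = len(U)
    U_aligned = [signs[j] * U[j] for j in range(m)]
    V_aligned = [[signs[j] * signs[k] * V[j][k] for k in range(m)]
                 for j in range(m)]
    return U_aligned, V_aligned, mismatched

# Toy example: variant 1 is swapped, variant 2 cannot be reconciled.
ref = [("A", "G"), ("C", "T"), ("G", "T")]
study = [("A", "G"), ("T", "C"), ("G", "A")]
U = [1.0, 2.0, 3.0]
V = [[1.0, 0.5, 0.2], [0.5, 2.0, 0.3], [0.2, 0.3, 3.0]]
U_al, V_al, bad = align_to_reference(ref, study, U, V)
```

Diagonal entries of V keep their sign for swapped variants because the sign factor is squared, so variances stay non-negative.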
PreMeta processes summary statistics from the four packages to make them mutually compatible, so that study-level analyses do not have to be redone. PreMeta documentation and executable are available at http://dlin.web.unc.edu/software/premeta.