Institute of Social and Preventive Medicine, University Hospital of Lausanne, Lausanne, Switzerland; Department of Computational Biology, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland.
Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK.
Bioinformatics. 2016 Nov 1;32(21):3298-3305. doi: 10.1093/bioinformatics/btw477. Epub 2016 Jul 10.
Only a few large systematic studies have evaluated the impact of copy number variants (CNVs) on common diseases. Several million individuals have been genotyped on single nucleotide variation arrays, and these data could be used for genome-wide CNV association studies. However, CNV calls remain prone to false positives, and only empirical filtering strategies exist in the literature. To overcome this issue, we defined a new quality score (QS) that estimates the probability that a CNV called by PennCNV is confirmed by other software.
Out-of-sample comparison showed that the correlation between the consensus CNV status and the QS is twice as high as it is for any previously proposed CNV filter. ROC curves displayed an AUC higher than 0.8, and simulations showed an increase of up to 20% in statistical power when using QS compared with other filtering strategies. Superior performance was also confirmed for an alternative consensus CNV definition and through improvement of known CNV-trait associations.
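As a rough illustration of how a score like QS could be evaluated against a binary consensus CNV status, the sketch below computes a ROC AUC as the probability that a randomly chosen confirmed call scores higher than a randomly chosen unconfirmed one. It is not the authors' pipeline: the function name `roc_auc`, the toy scores, and the toy consensus labels are all hypothetical.

```python
def roc_auc(scores, labels):
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one quality score per CNV call, and whether the call
# was confirmed by the consensus of other callers (1) or not (0).
qs = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
consensus = [1, 1, 0, 1, 0, 0]

auc = roc_auc(qs, consensus)
print(f"AUC = {auc:.3f}")
```

The pairwise formulation is quadratic in the number of calls, which is fine for a small check like this one; a sorted-rank implementation would be preferable for genome-wide call sets.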
http://goo.gl/T6yuFM
CONTACT: zoltan.kutalik@unil.ch or aurelien@mace@unil.ch
Supplementary information: Supplementary data are available at Bioinformatics online.