Li Mulin Jun, Pan Zhicheng, Liu Zipeng, Wu Jiexing, Wang Panwen, Zhu Yun, Xu Feng, Xia Zhengyuan, Sham Pak Chung, Kocher Jean-Pierre A, Li Miaoxin, Liu Jun S, Wang Junwen
Department of Statistics, Harvard University, Cambridge, MA 02138-2901, USA; Centre for Genomic Sciences.
Centre for Genomic Sciences, Department of Psychiatry.
Bioinformatics. 2016 Sep 15;32(18):2729-36. doi: 10.1093/bioinformatics/btw288. Epub 2016 Jun 6.
Prediction and prioritization of human non-coding regulatory variants are critical for understanding the regulatory mechanisms of disease pathogenesis and for promoting personalized medicine. Existing tools use functional genomics data and evolutionary information to evaluate the pathogenicity or regulatory function of non-coding variants. However, different algorithms produce inconsistent and even conflicting predictions. Combining multiple methods may increase accuracy in regulatory variant prediction.
Here, we compiled an integrative resource of predictions from eight different tools for functional annotation of non-coding variants. We further developed a composite strategy to integrate multiple predictions and compute the composite likelihood of a given variant being a regulatory variant. Benchmarking on multiple independent datasets of causal variants, we demonstrated that our composite model significantly improves prediction performance.
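The abstract does not give the form of the composite model, so the following is only a minimal sketch of one common way to integrate several predictions into a single probability. It assumes, purely for illustration, that each tool's score has already been converted into a likelihood ratio (evidence for regulatory versus non-regulatory) and that the tools are treated as conditionally independent. The function name `composite_posterior`, the `prior` parameter, and the example values are hypothetical and are not taken from PRVCS.

```python
import math


def composite_posterior(likelihood_ratios, prior=0.01):
    """Combine per-tool likelihood ratios into one posterior probability.

    Illustrative assumption: tools are conditionally independent, so their
    log likelihood ratios add to the prior log-odds. This is NOT claimed to
    be the PRVCS model. A value of None means the tool gave no prediction
    for this variant and is skipped.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")

    # Start from the prior log-odds of a variant being regulatory.
    log_odds = math.log(prior / (1.0 - prior))

    for lr in likelihood_ratios:
        if lr is None:
            continue  # missing prediction contributes no evidence
        if lr <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(lr)

    # Numerically stable logistic transform back to a probability.
    if log_odds >= 0.0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    z = math.exp(log_odds)
    return z / (1.0 + z)


if __name__ == "__main__":
    # Hypothetical likelihood ratios from three tools; one tool is missing.
    print(composite_posterior([2.0, None, 3.0, 0.5], prior=0.5))
```

Working in log-odds keeps the combination additive and lets a tool with no prediction for a variant simply be skipped. The independence assumption is a simplification, since tools trained on overlapping annotations are likely to be correlated.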
We implemented our model and scoring procedure as a tool, named PRVCS, which is freely available for academic and non-profit use at http://jjwanglab.org/PRVCS. Contact: wang.junwen@mayo.edu, jliu@stat.harvard.edu, or limx54@gmail.com
Supplementary data are available at Bioinformatics online.