Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA.
Department of Human Genetics, Stanford University, Stanford, CA 94305, USA.
Bioinformatics. 2020 Jan 15;36(2):364-372. doi: 10.1093/bioinformatics/btz612.
Genome-wide association studies have revealed that 88% of disease-associated single-nucleotide polymorphisms (SNPs) reside in noncoding regions. However, noncoding SNPs remain understudied, partly because they are challenging to prioritize for experimental validation. To address this deficiency, we developed the SNP effect matrix pipeline (SEMpl).
SEMpl estimates transcription factor-binding affinity by observing differences in chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) signal intensity for SNPs within functional transcription factor-binding sites (TFBSs) genome-wide. By cataloging the effects of every possible mutation within the TFBS motif, SEMpl can predict the consequences of SNPs for transcription factor binding. This knowledge can be used to identify potential disease-causing regulatory loci.
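The idea of a SNP effect matrix can be illustrated with a minimal Python sketch. It is not the SEMpl implementation: the function names, the toy data, the use of a log2 ratio against the mean signal of consensus-matching sites, and the restriction to single-substitution variants are all assumptions made for illustration. Signals are assumed to be strictly positive.

```python
from math import log2
from statistics import mean

NUCLEOTIDES = "ACGT"


def build_effect_matrix(sites, consensus):
    """Toy SNP effect matrix (hypothetical helper, not part of SEMpl).

    sites: list of (sequence, signal) pairs, where signal is a positive
           ChIP-seq intensity observed at a site with that sequence.
    Returns one dict per motif position mapping nucleotide -> log2 ratio of
    the mean signal at single-substitution sites to the mean signal at
    consensus sites, or None when no such site was observed.
    """
    baseline_signals = [signal for seq, signal in sites if seq == consensus]
    if not baseline_signals:
        raise ValueError("no sites match the consensus sequence")
    baseline = mean(baseline_signals)

    matrix = []
    for pos, ref_nt in enumerate(consensus):
        row = {}
        for nt in NUCLEOTIDES:
            if nt == ref_nt:
                row[nt] = 0.0  # the consensus base is the reference point
                continue
            # Sequence that differs from the consensus only at this position.
            variant = consensus[:pos] + nt + consensus[pos + 1:]
            signals = [signal for seq, signal in sites if seq == variant]
            row[nt] = log2(mean(signals) / baseline) if signals else None
        matrix.append(row)
    return matrix


def snp_effect(matrix, pos, ref, alt):
    """Predicted change in binding score when ref is replaced by alt at pos."""
    ref_score, alt_score = matrix[pos][ref], matrix[pos][alt]
    if ref_score is None or alt_score is None:
        return None  # not enough observations to score this SNP
    return alt_score - ref_score


# Made-up example data: a 3-bp "motif" with invented signal values.
toy_sites = [("ACG", 8.0), ("ACG", 8.0), ("TCG", 2.0), ("AGG", 4.0), ("ACT", 8.0)]
toy_matrix = build_effect_matrix(toy_sites, "ACG")
print(snp_effect(toy_matrix, 0, "A", "T"))  # -2.0
```

In this toy setting a negative score means the alternate allele is associated with weaker signal than the reference allele, a score near zero means little predicted change, and `None` marks substitutions for which no sites were observed.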
SEMpl is available from https://github.com/Boyle-Lab/SEM_CPP.
Supplementary data are available at Bioinformatics online.