GERV: a statistical method for generative evaluation of regulatory variants for transcription factor binding.

Authors

Zeng Haoyang, Hashimoto Tatsunori, Kang Daniel D, Gifford David K

Affiliations

Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA.

Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA and Department of Stem Cell and Regenerative Biology, Harvard University and Harvard Medical School, Cambridge, MA 02138, USA.

Publication information

Bioinformatics. 2016 Feb 15;32(4):490-6. doi: 10.1093/bioinformatics/btv565. Epub 2015 Oct 17.

DOI:10.1093/bioinformatics/btv565

PMID:26476779

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5860000/

Abstract

MOTIVATION

The majority of disease-associated variants identified in genome-wide association studies reside in noncoding regions of the genome with regulatory roles. Thus, being able to interpret the functional consequences of a variant is essential for identifying causal variants in the analysis of genome-wide association studies.

RESULTS

We present GERV (generative evaluation of regulatory variants), a novel computational method for predicting regulatory variants that affect transcription factor binding. GERV learns a k-mer-based generative model of transcription factor binding from ChIP-seq and DNase-seq data, and scores variants by computing the change in predicted ChIP-seq reads between the reference and alternate alleles. The k-mers learned by GERV capture more sequence determinants of transcription factor binding than a motif-based approach alone, including both a transcription factor's canonical motif and associated co-factor motifs. We show that GERV outperforms existing methods in predicting single-nucleotide polymorphisms associated with allele-specific binding. GERV correctly predicts a validated causal variant among linked single-nucleotide polymorphisms and prioritizes the variants previously reported to modulate the binding of FOXA1 in breast cancer cell lines. Thus, GERV provides a powerful approach for functionally annotating and prioritizing causal variants for experimental follow-up analysis.
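The scoring idea described above (predict a read signal for the reference and alternate alleles, then take the difference) can be illustrated with a toy sketch. This is not the GERV implementation: the abstract does not specify the model form, so the sketch assumes a purely additive model in which each k-mer carries one hypothetical weight, and it ignores DNase-seq data entirely. The names `TOY_WEIGHTS`, `predicted_signal`, and `variant_score` are invented for illustration.

```python
from typing import Dict, Iterator

# Hypothetical k-mer weights (k = 3); a real model would learn these from data.
TOY_WEIGHTS: Dict[str, float] = {"TGT": 2.0, "GTT": 1.5, "TGA": 0.5}


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping k-mer in seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def predicted_signal(seq: str, weights: Dict[str, float], k: int) -> float:
    """Toy predicted read signal: sum of k-mer weights (unseen k-mers add 0)."""
    return sum(weights.get(kmer, 0.0) for kmer in kmers(seq, k))


def variant_score(ref_seq: str, pos: int, alt: str,
                  weights: Dict[str, float], k: int) -> float:
    """Alternate-minus-reference change in predicted signal for a SNP.

    pos is a 0-based index into ref_seq; alt is the single alternate base.
    """
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    return (predicted_signal(alt_seq, weights, k)
            - predicted_signal(ref_seq, weights, k))


if __name__ == "__main__":
    # T>A at index 3 removes the TGT and GTT k-mers and creates TGA.
    print(variant_score("ATGTTAC", 3, "A", TOY_WEIGHTS, 3))  # -3.0
```

A negative score here means the alternate allele lowers the toy predicted signal. Only k-mers overlapping the variant position change, so in this additive setting at most k k-mers contribute to the score.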

AVAILABILITY AND IMPLEMENTATION

The implementation of GERV and related data are available at http://gerv.csail.mit.edu/.

Similar articles

On the identification of potential regulatory variants within genome wide association candidate SNP sets.

BMC Med Genomics. 2014 Jun 11;7:34. doi: 10.1186/1755-8794-7-34.

ABC: a tool to identify SNVs causing allele-specific transcription factor binding from ChIP-Seq experiments.

Bioinformatics. 2015 Sep 15;31(18):3057-9. doi: 10.1093/bioinformatics/btv321. Epub 2015 May 20.

SeqGL Identifies Context-Dependent Binding Signals in Genome-Wide Regulatory Element Maps.

PLoS Comput Biol. 2015 May 27;11(5):e1004271. doi: 10.1371/journal.pcbi.1004271. eCollection 2015 May.

Which Genetics Variants in DNase-Seq Footprints Are More Likely to Alter Binding?

PLoS Genet. 2016 Feb 22;12(2):e1005875. doi: 10.1371/journal.pgen.1005875. eCollection 2016 Feb.

A novel statistical method for quantitative comparison of multiple ChIP-seq datasets.

Bioinformatics. 2015 Jun 15;31(12):1889-96. doi: 10.1093/bioinformatics/btv094. Epub 2015 Feb 13.

atSNP: transcription factor binding affinity testing for regulatory SNP detection.

Bioinformatics. 2015 Oct 15;31(20):3353-5. doi: 10.1093/bioinformatics/btv328. Epub 2015 Jun 18.

SignalSpider: probabilistic pattern discovery on multiple normalized ChIP-Seq signal profiles.

Bioinformatics. 2015 Jan 1;31(1):17-24. doi: 10.1093/bioinformatics/btu604. Epub 2014 Sep 5.

De novo prediction of cis-regulatory elements and modules through integrative analysis of a large number of ChIP datasets.

BMC Genomics. 2014 Dec 2;15:1047. doi: 10.1186/1471-2164-15-1047.

Transcription factor-binding k-mer analysis clarifies the cell type dependency of binding specificities and cis-regulatory SNPs in humans.

BMC Genomics. 2023 Oct 7;24(1):597. doi: 10.1186/s12864-023-09692-9.

Cited by

OptimDase: An Algorithm for Predicting DNA Binding Sites with Combined Feature Encoding.

Interdiscip Sci. 2025 Jun 10. doi: 10.1007/s12539-025-00704-8.

A statistical approach for identifying single nucleotide variants that affect transcription factor binding.

iScience. 2024 Apr 18;27(5):109765. doi: 10.1016/j.isci.2024.109765. eCollection 2024 May 17.

Allele-specific binding (ASB) analyzer for annotation of allele-specific binding SNPs.

BMC Bioinformatics. 2023 Dec 8;24(1):464. doi: 10.1186/s12859-023-05604-6.

Transcription factor-binding k-mer analysis clarifies the cell type dependency of binding specificities and cis-regulatory SNPs in humans.

BMC Genomics. 2023 Oct 7;24(1):597. doi: 10.1186/s12864-023-09692-9.

Identifying functional regulatory mutation blocks by integrating genome sequencing and transcriptome data.

iScience. 2023 Jul 3;26(8):107266. doi: 10.1016/j.isci.2023.107266. eCollection 2023 Aug 18.

Profiling the quantitative occupancy of myriad transcription factors across conditions by modeling chromatin accessibility data.

Genome Res. 2022 Jun;32(6):1183-1198. doi: 10.1101/gr.272203.120. Epub 2022 May 24.

A general framework for predicting the transcriptomic consequences of non-coding variation and small molecules.

PLoS Comput Biol. 2022 Apr 14;18(4):e1010028. doi: 10.1371/journal.pcbi.1010028. eCollection 2022 Apr.

Deep neural networks identify sequence context features predictive of transcription factor binding.

Nat Mach Intell. 2021 Feb;3(2):172-180. doi: 10.1038/s42256-020-00282-y. Epub 2021 Jan 18.

Motif-Raptor: a cell type-specific and transcription factor centric approach for post-GWAS prioritization of causal regulators.

Bioinformatics. 2021 Aug 9;37(15):2103-2111. doi: 10.1093/bioinformatics/btab072.

The impact of different negative training data on regulatory sequence predictions.

PLoS One. 2020 Dec 1;15(12):e0237412. doi: 10.1371/journal.pone.0237412. eCollection 2020.
