Discovering Gene Regulatory Elements Using Coverage-Based Heuristics.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2018 Jul-Aug;15(4):1290-1300. doi: 10.1109/TCBB.2015.2496261. Epub 2015 Oct 30.

Abstract

Data mining algorithms and sequencing methods (such as RNA-seq and ChIP-seq) are being combined to discover genomic regulatory motifs that relate to a variety of phenotypes. However, motif discovery algorithms often produce very long lists of putative transcription factor binding sites, which makes it difficult to select a manageable set of candidate motifs for experimental validation and so hinders the discovery of phenotype-related regulatory elements. To address this issue, the authors introduce the motif selection problem and provide coverage-based search heuristics for its solution. Analysis of 203 ChIP-seq experiments from the ENCyclopedia of DNA Elements project shows that the proposed algorithms produce motifs with high sensitivity and specificity, and yields new insights into the regulatory code of the human genome. The greedy algorithm performs best, selecting a median of two motifs per ChIP-seq transcription factor group while achieving a median sensitivity of 77 percent.
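The abstract does not spell out the heuristics, so the following is only a minimal sketch of how a coverage-based greedy selection could look, assuming the problem is treated as a set-cover variant. The names `greedy_motif_selection`, `sensitivity`, `motif_hits`, `foreground`, and `min_gain`, as well as the toy data, are hypothetical and are not taken from the paper; the authors' actual objective and stopping rule may differ.

```python
from typing import Dict, List, Set


def greedy_motif_selection(
    motif_hits: Dict[str, Set[str]],
    foreground: Set[str],
    min_gain: int = 1,
) -> List[str]:
    """Greedy set-cover style selection (illustrative assumption, not the paper's code).

    motif_hits maps a motif name to the set of sequence IDs it occurs in.
    At each step the motif covering the most still-uncovered foreground
    sequences is chosen; selection stops when the best gain is below min_gain.
    """
    uncovered = set(foreground)
    remaining = set(motif_hits)
    selected: List[str] = []
    while uncovered and remaining:
        # Sort candidates so that ties are broken deterministically by name.
        best = max(sorted(remaining), key=lambda m: len(motif_hits[m] & uncovered))
        gain = len(motif_hits[best] & uncovered)
        if gain < min_gain:
            break
        selected.append(best)
        remaining.discard(best)
        uncovered -= motif_hits[best]
    return selected


def sensitivity(
    selected: List[str],
    motif_hits: Dict[str, Set[str]],
    foreground: Set[str],
) -> float:
    """Fraction of foreground sequences hit by at least one selected motif."""
    covered: Set[str] = set()
    for motif in selected:
        covered |= motif_hits[motif] & foreground
    return len(covered) / len(foreground)


# Toy data (hypothetical): five ChIP-seq peak sequences and four candidate motifs.
foreground = {"s1", "s2", "s3", "s4", "s5"}
motif_hits = {
    "M1": {"s1", "s2", "s3"},
    "M2": {"s3", "s4"},
    "M3": {"s2", "s3"},
    "M4": {"s9"},  # occurs only outside the foreground
}

chosen = greedy_motif_selection(motif_hits, foreground)
print(chosen, sensitivity(chosen, motif_hits, foreground))
```

On this toy input the sketch picks two motifs and leaves one sequence uncovered, which illustrates the trade-off the abstract reports: a short motif list at the cost of less than full sensitivity. Specificity would additionally require a background sequence set, which is omitted here.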
