MCOIN: a novel heuristic for determining transcription factor binding site motif width.

Authors

Kilpatrick Alastair M, Ward Bruce, Aitken Stuart

Affiliation

School of Informatics, University of Edinburgh, Informatics Forum, 10 Crichton Street, EH8 9AB Edinburgh, Scotland.

Publication

Algorithms Mol Biol. 2013 Jun 27;8(1):16. doi: 10.1186/1748-7188-8-16.

DOI:10.1186/1748-7188-8-16
PMID:23806098
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3716798/
Abstract

BACKGROUND

In transcription factor binding site discovery, the true width of the motif to be discovered is generally not known a priori. The ability to compute the most likely width of a motif is therefore a highly desirable property for motif discovery algorithms. However, this is a challenging computational problem as a result of changing model dimensionality at changing motif widths. The complexity of the problem is increased as the discovered model at the true motif width need not be the most statistically significant in a set of candidate motif models. Further, the core motif discovery algorithm used cannot guarantee to return the best possible result at each candidate width.

RESULTS

We present MCOIN, a novel heuristic for automatically determining transcription factor binding site motif width, based on motif containment and information content. Using realistic synthetic data and previously characterised prokaryotic data, we show that MCOIN outperforms the current most popular method (E-value of the resulting multiple alignment) as a predictor of motif width, based on mean absolute error. MCOIN is also shown to choose models which better match known sites at higher levels of motif conservation, based on ROC analysis.
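The abstract names two quantities without defining them: the information content of a motif model and the mean absolute error (MAE) of predicted widths. The following is a minimal sketch of the standard textbook definitions of both, not the MCOIN heuristic itself. The function names, the column-of-dictionaries matrix layout, and the uniform background distribution are illustrative assumptions that do not come from the paper.

```python
import math


def information_content(pwm, background=None):
    """Total information content, in bits, of a position weight matrix.

    pwm is a list of columns; each column maps a base to its probability.
    A uniform background over A, C, G, T is assumed when none is given
    (an assumption made for this sketch only).
    """
    if background is None:
        background = {base: 0.25 for base in "ACGT"}
    total = 0.0
    for column in pwm:
        for base, p in column.items():
            if p > 0:  # a zero-probability base contributes nothing
                total += p * math.log2(p / background[base])
    return total


def mean_absolute_error(predicted_widths, true_widths):
    """Mean absolute difference between predicted and true motif widths."""
    if len(predicted_widths) != len(true_widths):
        raise ValueError("width lists must have the same length")
    diffs = [abs(p - t) for p, t in zip(predicted_widths, true_widths)]
    return sum(diffs) / len(diffs)


# One fully conserved column plus one uninformative column.
conserved = {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0}
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
print(information_content([conserved, uniform]))
print(mean_absolute_error([10, 12, 8], [10, 10, 10]))
```

Under these definitions a fully conserved column contributes 2 bits and a uniform column contributes none, so widening a motif with uninformative flanking columns leaves its total information content unchanged. The widths in the last line are made-up values used only to exercise the function.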

CONCLUSIONS

We demonstrate the performance of MCOIN as part of a deterministic motif discovery algorithm and conclude that MCOIN outperforms current methods for determining motif width.

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ee5/3716798/92c77d4be001/1748-7188-8-16-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ee5/3716798/c623a07bfd85/1748-7188-8-16-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ee5/3716798/95b3b11a66c4/1748-7188-8-16-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ee5/3716798/4ac8bed5343f/1748-7188-8-16-4.jpg
