Compo: composite motif discovery using discrete models

Authors

Sandve Geir Kjetil, Abul Osman, Drabløs Finn

Affiliation

Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway.

Publication

BMC Bioinformatics. 2008 Dec 8;9:527. doi: 10.1186/1471-2105-9-527.

DOI: 10.1186/1471-2105-9-527
PMID: 19063744
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2614996/

Abstract

BACKGROUND

Computational discovery of motifs in biomolecular sequences is an established field, with applications both in the discovery of functional sites in proteins and regulatory sites in DNA. In recent years there has been increased attention towards the discovery of composite motifs, typically occurring in cis-regulatory regions of genes.

RESULTS

This paper describes Compo: a discrete approach to composite motif discovery that supports richer modeling of composite motifs and a more realistic background model compared to previous methods. Furthermore, multiple parameter and threshold settings are tested automatically, and the most interesting motifs across settings are selected. This avoids reliance on single hard thresholds, which has been a weakness of previous discrete methods. Comparison of motifs across parameter settings is made possible by the use of p-values as a general significance measure. Compo can either return an ordered list of motifs, ranked according to the general significance measure, or a Pareto front corresponding to a multi-objective evaluation on sensitivity, specificity and spatial clustering.
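The Pareto-front output mentioned above can be illustrated with a minimal sketch. This is not Compo's implementation: the `Candidate` type, the score values, and the convention that higher is better on all three objectives are assumptions made purely for illustration. A candidate motif is kept if no other candidate is at least as good on every objective and strictly better on at least one.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    # Hypothetical record for one candidate composite motif.
    # All three scores are assumed to be "higher is better".
    name: str
    sensitivity: float
    specificity: float
    clustering: float


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on every objective
    and strictly better on at least one."""
    a_scores, b_scores = a[1:], b[1:]
    at_least_as_good = all(x >= y for x, y in zip(a_scores, b_scores))
    strictly_better = any(x > y for x, y in zip(a_scores, b_scores))
    return at_least_as_good and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up scores, for illustration only.
candidates = [
    Candidate("m1", 0.9, 0.5, 0.4),
    Candidate("m2", 0.7, 0.8, 0.6),
    Candidate("m3", 0.6, 0.7, 0.5),  # dominated by m2
    Candidate("m4", 0.9, 0.5, 0.3),  # dominated by m1
]
front = pareto_front(candidates)
print([c.name for c in front])
```

Unlike a single ranked list, the front keeps motifs that trade one objective against another (here m1 favors sensitivity while m2 favors specificity and clustering), which is the "rich view" that a multi-objective evaluation offers.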

CONCLUSION

Compo performs very competitively compared to several existing methods on a collection of benchmark data sets. These benchmarks include a recently published, large benchmark suite where the use of support across sequences allows Compo to correctly identify binding sites even when the relevant PWMs are mixed with a large number of noise PWMs. Furthermore, the possibility of parameter-free running offers high usability, the support for multi-objective evaluation allows a rich view of potential regulators, and the discrete model allows flexibility in modeling and interpretation of motifs.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6be/2614996/7783f4b20617/1471-2105-9-527-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6be/2614996/7beaae7f446f/1471-2105-9-527-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c6be/2614996/1da188f8f584/1471-2105-9-527-2.jpg
