Evaluation of thresholds for the detection of binding sites for regulatory proteins in Escherichia coli K12 DNA.

Author information

Benítez-Bellón Esperanza, Moreno-Hagelsieb Gabriel, Collado-Vides Julio

Affiliations

Program of Computational Genomics, CIFN, UNAM, A.P. 565-A, Cuernavaca, Morelos 62100, Mexico.

Publication information

Genome Biol. 2002;3(3):RESEARCH0013. doi: 10.1186/gb-2002-3-3-research0013. Epub 2002 Feb 21.

Abstract

BACKGROUND

Sites in DNA that bind regulatory proteins can be detected computationally in various ways. Pattern discovery methods analyze collections of genes suspected to be co-regulated on the evidence, for example, of clustering of transcriptome data. Pattern searching methods use sequences with known binding sites to find other genes regulated by a given protein. Such computational methods are important strategies in the discovery and elaboration of regulatory networks and can provide the experimental biologist with a precise prediction of a binding site or identify a gene as a member of a set of co-regulated genes (a regulon). As more variations on such methods are published, however, thorough evaluation is necessary, as performance may differ depending on the conditions of use. Detailed evaluation also helps to improve and understand the behavior of the different methods and computational strategies.

RESULTS

We used a collection of 86 regulons from Escherichia coli as datasets to evaluate two methods for pattern discovery and pattern searching: dyad analysis/dyad sweeping using the program Dyad-analysis, and multiple alignment using the programs Consensus/Patser. Clearly defined statistical parameters are used to evaluate the two methods in different situations. We placed particular emphasis on minimizing the rate of false positives.
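To make the idea of evaluating a score threshold concrete, the following is a minimal sketch, not the statistical parameters defined in the paper. It assumes we already have scores for known binding sites and for background sequences that contain no site. The function name `evaluate_threshold` and all numbers are hypothetical, chosen only for illustration.

```python
def evaluate_threshold(true_scores, background_scores, threshold):
    """Return (sensitivity, false_positive_rate) for a given score cutoff.

    A sequence is predicted to be a site when its score >= threshold.
    """
    true_positives = sum(s >= threshold for s in true_scores)
    false_positives = sum(s >= threshold for s in background_scores)
    sensitivity = true_positives / len(true_scores)
    false_positive_rate = false_positives / len(background_scores)
    return sensitivity, false_positive_rate


# Hypothetical scores, for illustration only.
true_scores = [9.1, 7.4, 6.8, 5.2]
background_scores = [6.9, 4.0, 3.1, 2.5, 1.0]

for cutoff in (5.0, 7.0):
    print(cutoff, evaluate_threshold(true_scores, background_scores, cutoff))
```

In this toy data, raising the cutoff removes the false positive but also discards half of the true sites, which is the trade-off a threshold evaluation has to quantify.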

CONCLUSIONS

As a general rule, sensors obtained from experimentally reported binding sites in DNA frequently locate true sites as the highest-scoring sequences within a given upstream region, especially using Consensus/Patser. Pattern discovery is still an unsolved problem, although in the cases where Dyad-analysis finds significant dyads (around 50%), these frequently correspond to true binding sites. With more robust methods, regulatory predictions could help identify the function of unknown genes.
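The notion of a true site appearing as the highest-scoring sequence within an upstream region can be illustrated with a small weight-matrix scan. This is a simplified sketch and not the actual Consensus/Patser implementation: it assumes a uniform background, a fixed pseudocount, and a single-strand scan, and the helper names `build_pwm` and `best_site`, the example sites, and the upstream sequence are all hypothetical.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from equal-length aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def best_site(pwm, sequence):
    """Return (start, score, window) of the highest-scoring window."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if best is None or score > best[1]:
            best = (start, score, window)
    return best


# Hypothetical aligned sites and upstream region, for illustration only.
sites = ["TGTGA", "TGTGA", "TGTGC", "TGAGA"]
pwm = build_pwm(sites)
upstream = "ACCGTTGTGAACGT"
start, score, window = best_site(pwm, upstream)
print(start, window, round(score, 2))
```

A realistic search would also scan the reverse complement and compare the best score against a threshold before reporting a site.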

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b2ad/88811/cc699fcc1699/gb-2002-3-3-research0013-1.jpg
