Computational annotation of transcription factor binding sites in D. melanogaster developmental genes.

Author information

Narang Vipin, Sung Wing-Kin, Mittal Ankush

Affiliation

Department of Computer Science, 3 Science Drive 2, National University of Singapore, 117543, Singapore.

Publication information

Genome Inform. 2006;17(2):14-24.

Abstract

Drosophila melanogaster is one of the most important organisms for studying the genetics of development. The precise regulation of genes during early development is enacted through the control of transcription. The control circuitry is hardwired in the genome as clusters of multiple transcription factor binding sites (TFBS) known as cis-regulatory modules (CRMs). A number of TFBS and CRMs have been experimentally annotated in the Drosophila genome. Currently about 661 CRM sequences are known, of which 155 have been annotated with 778 TFBS. This work attempts computational annotation of TFBS in the remaining 506 uncharacterized Drosophila CRMs. The difficulty of this task lies in the fact that experimental data is insufficient for constructing reliable positional weight matrices (PWM) to predict the TFBS. Thus a novel feature extraction and classification method for TFBS detection has been implemented in this work. The method achieves both high sensitivity and low false positive rate in cross-validation studies. As a result of this work, a new database has been compiled which aggregates all the CRM and TFBS annotation information for Drosophila available to date, and appends new TFBS annotations.
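
For context, the conventional PWM approach that the abstract describes as unreliable under limited data can be sketched as follows. This is a minimal illustration, not the feature extraction and classification method of the paper: the binding sites, the pseudocount, the uniform background frequency, and the score threshold are all assumptions chosen for the example. With only a handful of known sites per factor, each column's frequencies are dominated by the pseudocount, which is one way to see why such matrices are hard to trust.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from aligned binding sites of equal length."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        # Log2 ratio of smoothed base frequency to the background frequency.
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm

def score(pwm, window):
    """Sum the per-position log-odds scores for one sequence window."""
    return sum(column[base] for column, base in zip(pwm, window))

def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        s = score(pwm, sequence[start:start + width])
        if s >= threshold:
            hits.append((start, s))
    return hits

# Hypothetical aligned sites, invented for illustration only.
sites = ["TTAATCC", "TTAATCC", "CTAATCC", "TTAAGCC"]
pwm = build_pwm(sites)

# Hypothetical sequence and threshold, also invented for illustration.
hits = scan(pwm, "GGGTTAATCCGGG", threshold=5.0)
print(hits)
```

Changing the pseudocount or adding a single site shifts the scores noticeably when only four sites are available, which is the small-sample sensitivity the abstract alludes to.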
