An ANN-GA model based promoter prediction in Arabidopsis thaliana using tilling microarray data.

Authors

Mishra Hrishikesh, Singh Nitya, Misra Krishna, Lahiri Tapobrata

Affiliation

Division of Applied Sciences and Indo-Russian Centre for Biotechnology, Indian Institute of Information Technology, Allahabad, India.

Publication information

Bioinformation. 2011;6(6):240-3. doi: 10.6026/97320630006240. Epub 2011 Jun 6.

Abstract

Identifying promoter regions is an important part of gene annotation, particularly in eukaryotes, where promoters modulate various metabolic functions and cellular stress responses. In this work, a novel approach that uses intensity values from tiling microarray data for the model eukaryotic plant Arabidopsis thaliana was used to distinguish promoter regions from non-promoter regions. A feed-forward back-propagation neural network model supported by a genetic algorithm was employed to predict the class of the data with a window size of 41. The dataset comprised 2992 data vectors representing both promoter and non-promoter regions, chosen randomly from probe intensity vectors for the whole genome of Arabidopsis thaliana generated by the tiling microarray technique. The classifier model shows prediction accuracies of 69.73% and 65.36% on the training and validation sets, respectively. Further, a concept of distance-based class membership was used to validate the reliability of the classifier, and it showed promising results. The study shows that microarray probe intensities can be used to predict promoter regions in eukaryotic genomes.
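To make the pipeline in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It cuts a probe-intensity signal into windows of 41 values and lets a simple genetic algorithm search the weights of a small feed-forward network. The abstract does not say how the genetic algorithm supports back-propagation, so here the genetic algorithm alone replaces gradient training; the hidden-layer size, population size, mutation scale, and all function names are illustrative assumptions, and the demo data are synthetic.

```python
import numpy as np

WINDOW = 41  # window size stated in the abstract


def sliding_windows(intensities, window=WINDOW):
    """Return every contiguous window of probe intensities, one per row."""
    x = np.asarray(intensities, dtype=float)
    if x.size < window:
        raise ValueError("signal is shorter than the window")
    starts = np.arange(x.size - window + 1)[:, None]
    return x[starts + np.arange(window)[None, :]]


def unpack(genome, n_in, n_hidden):
    """Split a flat genome into the weights of a one-hidden-layer network."""
    i = n_in * n_hidden
    w1 = genome[:i].reshape(n_in, n_hidden)
    b1 = genome[i:i + n_hidden]
    w2 = genome[i + n_hidden:i + 2 * n_hidden]
    b2 = genome[i + 2 * n_hidden]
    return w1, b1, w2, b2


def predict(genome, X, n_hidden):
    """Forward pass: tanh hidden layer, thresholded linear output (0 or 1)."""
    w1, b1, w2, b2 = unpack(genome, X.shape[1], n_hidden)
    hidden = np.tanh(X @ w1 + b1)
    return (hidden @ w2 + b2 > 0).astype(int)


def accuracy(genome, X, y, n_hidden):
    """Fraction of vectors whose predicted class matches the label."""
    return float(np.mean(predict(genome, X, n_hidden) == y))


def evolve(X, y, n_hidden=5, pop_size=30, generations=40, seed=0):
    """Assumed GA: truncation selection, uniform crossover, Gaussian mutation."""
    rng = np.random.default_rng(seed)
    n_genes = X.shape[1] * n_hidden + 2 * n_hidden + 1
    pop = rng.normal(0.0, 0.5, size=(pop_size, n_genes))
    for _ in range(generations):
        fitness = np.array([accuracy(g, X, y, n_hidden) for g in pop])
        order = np.argsort(fitness)[::-1]
        parents = pop[order[:pop_size // 2]]  # keep the fitter half as parents
        pa = parents[rng.integers(len(parents), size=pop_size - 1)]
        pb = parents[rng.integers(len(parents), size=pop_size - 1)]
        mask = rng.random(pa.shape) < 0.5  # uniform crossover
        children = np.where(mask, pa, pb) + rng.normal(0.0, 0.1, size=pa.shape)
        pop = np.vstack([pop[order[0]], children])  # elitism: best survives
    fitness = np.array([accuracy(g, X, y, n_hidden) for g in pop])
    return pop[np.argmax(fitness)]


if __name__ == "__main__":
    # Synthetic stand-in data, NOT the Arabidopsis tiling-array intensities.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(200, WINDOW))
    y = (rng.random(200) < 0.5).astype(int)
    X[y == 1] += 0.5  # shift the "promoter-like" class so it is learnable
    best = evolve(X, y)
    print("training accuracy on synthetic data:", accuracy(best, X, y, 5))
```

Because the best genome of each generation is carried over unchanged, the best training accuracy never decreases from one generation to the next. A closer reproduction of the paper would presumably combine such a search with back-propagation and evaluate on a held-out validation set, as the reported 69.73% and 65.36% figures imply.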

