Singh Gautam B, Singh Harkirat
Computer Science and Engineering, Oakland University, Rochester, MI 48309, USA.
Mol Biotechnol. 2005 Feb;29(2):165-83. doi: 10.1385/MB:29:2:165.
A variety of patterns that serve as control points for gene expression and cellular functions have been observed in DNA and protein sequences. Owing to their vital role, the patterns discovered in biological sequences are generally cataloged and maintained in internationally shared databases. Furthermore, the variability within a family of observed patterns is often represented using computational models in order to facilitate searching for them in an uncharacterized biological sequence. Because biological data comprise a mosaic of sequence-level motifs, it is important to unravel the synergies of macromolecular coordination utilized in the cell-specific differential synthesis of proteins. This article provides an overview of the various pattern representation methodologies and surveys the pattern databases available to molecular biologists. Our aim is to describe the principles behind the computational modeling and analysis techniques utilized in bioinformatics research, with the objective of providing the insight necessary to better understand and effectively utilize the available databases and analysis tools. We also provide a detailed review of the DNA sequence-level patterns responsible for structural conformations within the Scaffold or Matrix Attachment Regions (S/MARs).