Genomics and Computational Biology Research Group, Faculty of Computing, Engineering and Science, University of South Wales, Pontypridd, UK.
Bioinformatics. 2013 Sep 1;29(17):2203-5. doi: 10.1093/bioinformatics/btt366. Epub 2013 Jun 21.
One of the major challenges for contemporary bioinformatics is the analysis and accurate annotation of genomic datasets to enable extraction of useful information about the functional role of DNA sequences. This article describes a novel genome-wide statistical approach to detecting specific DNA sequence motifs, based on similarities between the promoters of similarly expressed genes. The new tool, cisExpress, is designed specifically for large datasets, such as those generated by publicly accessible whole-genome and transcriptome projects. cisExpress uses a task-farming algorithm to exploit all available computational cores within a shared-memory node. We demonstrate the robustness and validity of the proposed method, which is applicable to a wide range of genomic databases for any species of interest.
cisExpress is available at www.cisexpress.org.
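The task-farming idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the cisExpress implementation: the toy promoter sequences, the function names `count_motif` and `farm`, and the choice of one task per candidate motif are all assumptions made purely for illustration. A pool of worker processes on a single shared-memory node pulls independent tasks from a common queue until none remain.

```python
import os
from multiprocessing import Pool

# Hypothetical toy data: promoter sequences keyed by gene ID (illustration only).
PROMOTERS = {
    "geneA": "TATAAAGGCGCGTATA",
    "geneB": "GGGCGCGCCTATAAAT",
    "geneC": "ACGTACGTACGTACGT",
}


def count_motif(motif):
    """Worker task: count the promoters that contain the motif at least once."""
    hits = sum(1 for seq in PROMOTERS.values() if motif in seq)
    return motif, hits


def farm(motifs, workers=None):
    """Farm out one independent task per motif to a pool of worker processes."""
    workers = workers or os.cpu_count() or 1
    if workers == 1:
        # Serial fallback: skip process start-up overhead when only one worker is requested.
        return dict(count_motif(m) for m in motifs)
    with Pool(processes=workers) as pool:
        # imap_unordered hands the next task to whichever worker becomes free first.
        return dict(pool.imap_unordered(count_motif, motifs))


if __name__ == "__main__":
    print(farm(["TATAAA", "GCGC", "ACGT"]))
```

Because each motif is scored independently of the others, the tasks need no communication between workers, which is what makes this kind of workload a natural fit for task farming. A real analysis would replace the simple substring count with the statistical test relating motif presence to gene expression.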