A software program combining sequence motif searches with keywords for finding repeats containing DNA sequences.

Authors

Bilgen Mehmet, Karaca Mehmet, Onus A Naci, Ince Ayşe Gül

Affiliation

Faculty of Agriculture, Akdeniz University, 07059 Antalya, Turkey.

Publication

Bioinformatics. 2004 Dec 12;20(18):3379-86. doi: 10.1093/bioinformatics/bth410. Epub 2004 Jul 15.

Abstract

MOTIVATION

One of the most interesting features of genomes, in both coding and non-coding regions, is the presence of relatively short, tandemly repeated DNA sequences known as tandem repeats (TRs). We developed a new PC-based, stand-alone analysis program that combines sequence motif searches with keywords such as organs, tissues, cell lines or developmental stages to find exact, inexact and compound TRs. Compared with other repeat finder programs, Tandem Repeats Analyzer 1.5 (TRA) offers several advanced repeat search parameters and options: it not only accepts GenBank, FASTA and expressed sequence tag (EST) sequence files but also analyzes multiple files, each containing multiple sequences. Advanced user-defined parameters let researchers apply different search criteria to motifs of different lengths simultaneously. The output reports statistical results for the user to evaluate. Discovering TRs in ESTs could be useful for gene mapping and association studies, and for identifying TRs located in the coding regions of important genes that are expressed under particular environmental or stress conditions, or in particular organs, tissues and developmental stages.
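To make the idea of length-specific search criteria concrete, the following is a minimal sketch of an exact tandem repeat search in which each motif length has its own minimum copy number. It is not TRA's implementation, which the abstract does not describe: the function names, the regular-expression approach and the thresholds in `MIN_REPEATS` are all assumptions chosen for illustration, and inexact and compound repeats are not handled.

```python
import re

# Hypothetical thresholds (not taken from TRA): motif length -> minimum tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """Return True if the motif is not a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))


def find_exact_repeats(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for exact tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for length, min_copies in sorted(min_repeats.items()):
        # A motif of the given length followed by at least (min_copies - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (length, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs such as 'ATAT' that are already reported as 'AT'.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // length))
    return sorted(hits)


if __name__ == "__main__":
    example = "GG" + "AT" * 7 + "CC" + "GAA" * 5 + "T"
    for start, motif, copies in find_exact_repeats(example):
        print(start, motif, copies)
```

Keeping the thresholds in a dictionary keyed by motif length means one pass over the settings covers all lengths, which mirrors the "different criteria for different motif lengths" option in spirit only.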

RESULTS

In this paper, we demonstrate applications of TRA using 175,899 EST sequences from three Arabidopsis species downloaded from GenBank. The EST-SSR/EST ratios were 43.1%, 15.3% and 2.34% in A. lyrata, A. thaliana and A. halleri, respectively. The analysis revealed that organs, tissues and developmental stages differed in the number and composition of their repeats. This indicates that the distribution of TRs among tissues or organs may not be random, in contrast to the untranscribed repeats found in genomes.
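A ratio of this kind, broken down by keyword, could be computed roughly as follows. This is a sketch under stated assumptions rather than the authors' procedure: it treats the ratio as the share of ESTs containing at least one repeat, matches keywords as plain substrings of the record description, and uses a stand-in detector (`SSR`, any 2 to 6 bp unit repeated at least five times) in place of TRA's own search. The function name and the record format are hypothetical.

```python
import re
from collections import defaultdict

# Stand-in detector (assumption): a 2-6 bp unit repeated at least 5 times in tandem.
SSR = re.compile(r"([ACGT]{2,6})\1{4,}")


def ssr_ratio_by_keyword(records, keywords):
    """Group ESTs by keyword and report how many contain a repeat.

    records: iterable of (description, sequence) pairs.
    Returns {keyword: (ests_with_repeat, total_ests, percent)} for keywords
    that matched at least one record.
    """
    counts = defaultdict(lambda: [0, 0])  # keyword -> [with_repeat, total]
    for description, sequence in records:
        text = description.lower()
        for keyword in keywords:
            if keyword.lower() in text:
                counts[keyword][1] += 1
                if SSR.search(sequence.upper()):
                    counts[keyword][0] += 1
    return {
        keyword: (hit, total, 100.0 * hit / total)
        for keyword, (hit, total) in counts.items()
    }


if __name__ == "__main__":
    demo = [
        ("root cDNA clone 1", "CC" + "AG" * 6 + "TT"),
        ("root cDNA clone 2", "ACGTACGATCGATTGCA"),
        ("leaf cDNA", "GAA" * 5),
    ]
    print(ssr_ratio_by_keyword(demo, ["root", "leaf", "flower"]))
```

The demo records are invented solely to exercise the function and carry no biological meaning.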

AVAILABILITY

The program is available free of charge by anonymous FTP from ftp.akdeniz.edu.tr/Araclar/TRA.
