
Improved tools for DNA comparison and clustering.

Author information

Parsons J D

Affiliation

Genome Sequencing Center, Washington University School of Medicine, St Louis, MO 63108, USA.

Publication information

Comput Appl Biosci. 1995 Dec;11(6):603-13. doi: 10.1093/bioinformatics/11.6.603.

DOI: 10.1093/bioinformatics/11.6.603
PMID: 8808576

Abstract

DNA sequence clustering is an effective aid of the comprehension, summarization and compression of DNA sequence databases. Previous work created programs suitable for the comparison and clustering of cDNA sequences but new enhanced programs have been written to cluster genomic DNA fragments, large EST projects, and entire DNA databases. Three new programs (ICAtools) are discussed: ICAass, N2tool, and ICAmatches. ICAass has been used to compress the EMBL database by hiding or removing sequences with various degrees of redundancy. It also has the fastest database querying mode. N2tool provides fast and sensitive clustering of genomic fragment databases on the basis of small areas of local similarity. N2tool has proven utility in the discovery of contaminating vector or other artefactual sequence when the potential contaminant is not otherwise known. ICAmatches is a new cluster analysis program that uses a novel alignment style to present multiple alignment summaries. All the tools are convenient to use because they share a common memory-frugal index format and accept most DNA sequence formats directly.
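
To make the idea of clustering fragments "on the basis of small areas of local similarity" concrete, here is a minimal sketch in Python. It is not the ICAtools or N2tool algorithm, whose details the abstract does not give. It assumes that two sequences belong to the same cluster when they share at least one exact k-mer, and it merges clusters with a union-find structure. The function names, the value of `k`, and the exact-match criterion are all illustrative assumptions.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_by_shared_kmers(seqs, k=12):
    """Group sequences that share at least one exact k-mer.

    seqs maps a sequence name to its DNA string. This is a hypothetical
    illustration of local-similarity clustering, not the published method.
    """
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # k-mer -> name of the first sequence that contained it
    for name, seq in seqs.items():
        for km in kmers(seq, k):
            if km in first_seen:
                root_a, root_b = find(first_seen[km]), find(name)
                if root_a != root_b:
                    parent[root_b] = root_a  # merge the two clusters
            else:
                first_seen[km] = name

    clusters = defaultdict(list)
    for name in seqs:
        clusters[find(name)].append(name)
    return sorted(sorted(members) for members in clusters.values())


example = {
    "a": "ACGTACGTTTGACCA",
    "b": "GGGTTTGACCATTT",
    "c": "CCCCCCCCCCCC",
}
print(cluster_by_shared_kmers(example, k=8))  # [['a', 'b'], ['c']]
```

In the example, sequences `a` and `b` share the 8-mer `TTTGACCA` and so fall into one cluster, while `c` shares nothing with either and stays alone. A real tool would also need to handle reverse complements, low-complexity regions, and sequencing errors, none of which this sketch attempts.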

Similar articles

1
Improved tools for DNA comparison and clustering.
Comput Appl Biosci. 1995 Dec;11(6):603-13. doi: 10.1093/bioinformatics/11.6.603.
2
A RAPID algorithm for sequence database comparisons: application to the identification of vector contamination in the EMBL databases.
Bioinformatics. 1999 Feb;15(2):111-21. doi: 10.1093/bioinformatics/15.2.111.
3
CLEANUP: a fast computer program for removing redundancies from nucleotide sequence databases.
Comput Appl Biosci. 1996 Feb;12(1):1-8. doi: 10.1093/bioinformatics/12.1.1.
4
Miropeats: graphical DNA sequence comparisons.
Comput Appl Biosci. 1995 Dec;11(6):615-9. doi: 10.1093/bioinformatics/11.6.615.
5
Using the FASTA program to search protein and DNA sequence databases.
Methods Mol Biol. 1994;25:365-89. doi: 10.1385/0-89603-276-0:365.
6
ADVANCE and ADAM: two algorithms for the analysis of global similarity between homologous informational sequences.
Comput Appl Biosci. 1994 Feb;10(1):3-5. doi: 10.1093/bioinformatics/10.1.3.
7
EST_GENOME: a program to align spliced DNA sequences to unspliced genomic DNA.
Comput Appl Biosci. 1997 Aug;13(4):477-8. doi: 10.1093/bioinformatics/13.4.477.
8
Multiple structural alignment and clustering of RNA sequences.
Bioinformatics. 2007 Apr 15;23(8):926-32. doi: 10.1093/bioinformatics/btm049. Epub 2007 Feb 25.
9
Methods for comparing a DNA sequence with a protein sequence.
Comput Appl Biosci. 1996 Dec;12(6):497-506. doi: 10.1093/bioinformatics/12.6.497.
10
Post-processing of BLAST results using databases of clustered sequences.
Comput Appl Biosci. 1997 Feb;13(1):81-7. doi: 10.1093/bioinformatics/13.1.81.

Cited by

1
Development and Validation of Single Nucleotide Polymorphism (SNP) Markers from an Expressed Sequence Tag (EST) Database in Olive Flounder (Paralichthys olivaceus).
Dev Reprod. 2014 Dec;18(4):275-86. doi: 10.12717/devrep.2014.18.4.275.
2
The Expression Analysis of Complement Component C3 during Early Developmental Stages in Olive Flounder (Paralichthys olivaceus).
Dev Reprod. 2013 Dec;17(4):311-9. doi: 10.12717/DR.2013.17.4.311.
3
Expression Analysis of Cathepsin F during Embryogenesis and Early Developmental Stage in Olive Flounder (Paralichthys olivaceus).
Dev Reprod. 2013 Sep;17(3):221-9. doi: 10.12717/DR.2013.17.3.221.
4
Recurrent deletions and reciprocal duplications of 10q11.21q11.23 including CHAT and SLC18A3 are likely mediated by complex low-copy repeats.
Hum Mutat. 2012 Jan;33(1):165-79. doi: 10.1002/humu.21614. Epub 2011 Nov 2.
5
A grammar-based distance metric enables fast and accurate clustering of large sets of 16S sequences.
BMC Bioinformatics. 2010 Dec 17;11:601. doi: 10.1186/1471-2105-11-601.
6
A genome-wide survey of segmental duplications that mediate common human genetic variation of chromosomal architecture.
Hum Genomics. 2004 Aug;1(5):335-44. doi: 10.1186/1479-7364-1-5-335.
7
EST and microarray analyses of pathogen-responsive genes in hot pepper (Capsicum annuum L.) non-host resistance against soybean pustule pathogen (Xanthomonas axonopodis pv. glycines).
Funct Integr Genomics. 2004 Jul;4(3):196-205. doi: 10.1007/s10142-003-0099-1. Epub 2004 Feb 4.
8
Fugu ESTs: new resources for transcription analysis and genome annotation.
Genome Res. 2003 Dec;13(12):2747-53. doi: 10.1101/gr.1691503. Epub 2003 Nov 12.
9
d2_cluster: a validated method for clustering EST and full-length cDNA sequences.
Genome Res. 1999 Nov;9(11):1135-42. doi: 10.1101/gr.9.11.1135.
10
Generation and analysis of 25 Mb of genomic DNA from the pufferfish Fugu rubripes by sequence scanning.
Genome Res. 1999 Oct;9(10):960-71. doi: 10.1101/gr.9.10.960.