
Comprehensive and relaxed search for oligonucleotide signatures in hierarchically clustered sequence datasets.

Affiliation

Services Department of Informatics, Technische Universität München, Boltzmannstrasse 3, 85748 Garching, Germany.

Publication information

Bioinformatics. 2011 Jun 1;27(11):1546-54. doi: 10.1093/bioinformatics/btr161. Epub 2011 Apr 5.

DOI: 10.1093/bioinformatics/btr161
PMID: 21471017

Abstract

MOTIVATION

PCR, hybridization, DNA sequencing and other important methods in molecular diagnostics rely on both sequence-specific and sequence group-specific oligonucleotide primers and probes. Their design depends on the identification of oligonucleotide signatures in whole genome or marker gene sequences. Although genome and gene databases are generally available and regularly updated, collections of valuable signatures are rare. Even for single requests, the search for signatures becomes computationally expensive when working with large collections of target (and non-target) sequences. Moreover, with growing dataset sizes, the chance of finding exact group-matching signatures decreases, necessitating the application of relaxed search methods. The resultant substantial increase in complexity is exacerbated by the dearth of algorithms able to solve these problems efficiently.

RESULTS

We have developed CaSSiS, a fast and scalable method for computing comprehensive collections of sequence- and sequence group-specific oligonucleotide signatures from large sets of hierarchically clustered nucleic acid sequence data. Based on the ARB Positional Tree (PT-)Server and a newly developed BGRT data structure, CaSSiS not only determines sequence-specific signatures and perfect group-covering signatures for every node within the cluster (i.e. target groups), but also signatures with maximal group coverage (sensitivity) within a user-defined range of non-target hits (specificity) for groups lacking a perfect common signature. An upper limit of tolerated mismatches within the target group, as well as the minimum number of mismatches with non-target sequences, can be predefined. Test runs with one of the largest phylogenetic gene sequence datasets available indicate good runtime and memory performance, and in silico spot tests have shown the usefulness of the resulting signature sequences as blueprints for group-specific oligonucleotide probes.
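The idea of a signature with maximal group coverage under a bounded number of non-target hits can be illustrated with a small brute-force sketch. This is not the CaSSiS implementation: it does not use the ARB PT-Server or the BGRT data structure, it considers exact matches only (no mismatch tolerance), and the function names, the `groups` mapping and the toy sequences are all assumptions made for illustration.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings (candidate signatures) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_signatures(sequences, groups, k, max_outgroup_hits=0):
    """For each group, find the k-mers with maximal in-group coverage.

    sequences: dict mapping sequence id -> nucleotide string
    groups: dict mapping group name -> set of sequence ids
            (hypothetical stand-in for the nodes of a cluster tree)
    max_outgroup_hits: tolerated number of matched non-target sequences
    Returns dict: group name -> (coverage, outgroup_hits, sorted k-mers)
    """
    # Index: k-mer -> ids of the sequences that contain it (exact matches only).
    index = defaultdict(set)
    for seq_id, seq in sequences.items():
        for kmer in kmers(seq, k):
            index[kmer].add(seq_id)

    results = {}
    for name, members in groups.items():
        best_cov, best_out, best_kmers = 0, 0, []
        for kmer, hits in index.items():
            ingroup = len(hits & members)   # sensitivity: targets covered
            outgroup = len(hits - members)  # specificity: non-targets hit
            if ingroup == 0 or outgroup > max_outgroup_hits:
                continue
            # Prefer higher coverage; break ties by fewer non-target hits.
            if ingroup > best_cov or (ingroup == best_cov and outgroup < best_out):
                best_cov, best_out, best_kmers = ingroup, outgroup, [kmer]
            elif ingroup == best_cov and outgroup == best_out:
                best_kmers.append(kmer)
        results[name] = (best_cov, best_out, sorted(best_kmers))
    return results


# Toy data (invented): four short sequences and two nested target groups.
sequences = {
    "s1": "ACGTAC",
    "s2": "ACGTTG",
    "s3": "TTGACG",
    "s4": "GGGCCC",
}
groups = {
    "A": {"s1", "s2"},        # has a perfect group-covering 4-mer
    "B": {"s1", "s2", "s3"},  # has no perfect 4-mer; best coverage is 2 of 3
}
result = best_signatures(sequences, groups, k=4, max_outgroup_hits=0)
print(result["A"])  # (2, 0, ['ACGT'])
print(result["B"])  # (2, 0, ['ACGT'])
```

Group B shows the relaxed case: no 4-mer occurs in all three members, so the sketch reports the best partial coverage instead of returning nothing. Building the index this way scans every k-mer for every group, which is exactly the cost that becomes prohibitive on large datasets and that the abstract's dedicated data structures are meant to avoid.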

AVAILABILITY

Software and Supplementary Material are available at http://cassis.in.tum.de/.

Similar articles

1
Comprehensive and relaxed search for oligonucleotide signatures in hierarchically clustered sequence datasets.
Bioinformatics. 2011 Jun 1;27(11):1546-54. doi: 10.1093/bioinformatics/btr161. Epub 2011 Apr 5.
2
PTPan--overcoming memory limitations in oligonucleotide string matching for primer/probe design.
Bioinformatics. 2011 Oct 15;27(20):2797-805. doi: 10.1093/bioinformatics/btr483. Epub 2011 Aug 19.
3
Graphical representation of ribosomal RNA probe accessibility data using ARB software package.
BMC Bioinformatics. 2005 Mar 21;6:61. doi: 10.1186/1471-2105-6-61.
4
PROBEmer: A web-based software tool for selecting optimal DNA oligos.
Nucleic Acids Res. 2003 Jul 1;31(13):3746-50. doi: 10.1093/nar/gkg569.
5
Individual sequences in large sets of gene sequences may be distinguished efficiently by combinations of shared sub-sequences.
BMC Bioinformatics. 2005 Apr 8;6:90. doi: 10.1186/1471-2105-6-90.
6
Design of long oligonucleotide probes for functional gene detection in a microbial community.
Bioinformatics. 2005 Nov 15;21(22):4092-100. doi: 10.1093/bioinformatics/bti673. Epub 2005 Sep 13.
7
Cluster oligonucleotide signatures for rapid identification by sequencing.
BMC Bioinformatics. 2018 Oct 29;19(1):395. doi: 10.1186/s12859-018-2363-3.
8
probeBase--an online resource for rRNA-targeted oligonucleotide probes and primers: new features 2016.
Nucleic Acids Res. 2016 Jan 4;44(D1):D586-9. doi: 10.1093/nar/gkv1232. Epub 2015 Nov 19.
9
Decoding non-unique oligonucleotide hybridization experiments of targets related by a phylogenetic tree.
Bioinformatics. 2006 Jul 15;22(14):e424-30. doi: 10.1093/bioinformatics/btl254.
10
YODA: selecting signature oligonucleotides.
Bioinformatics. 2005 Apr 15;21(8):1365-70. doi: 10.1093/bioinformatics/bti182. Epub 2004 Nov 30.

Cited by

1
Cluster oligonucleotide signatures for rapid identification by sequencing.
BMC Bioinformatics. 2018 Oct 29;19(1):395. doi: 10.1186/s12859-018-2363-3.
2
Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations.
Nucleic Acids Res. 2017 Oct 13;45(18):e159. doi: 10.1093/nar/gkx702.
3
HTSFinder: Powerful Pipeline of DNA Signature Discovery by Parallel and Distributed Computing.
Evol Bioinform Online. 2016 Feb 10;12:73-85. doi: 10.4137/EBO.S35545. eCollection 2016.
4
An algorithm of discovering signatures from DNA databases on a computer cluster.
BMC Bioinformatics. 2014 Oct 5;15(1):339. doi: 10.1186/1471-2105-15-339.
5
PRISE2: software for designing sequence-selective PCR primers and probes.
BMC Bioinformatics. 2014 Sep 25;15(1):317. doi: 10.1186/1471-2105-15-317.
6
PhylOPDb: a 16S rRNA oligonucleotide probe database for prokaryotic identification.
Database (Oxford). 2014 Apr 26;2014(0):bau036. doi: 10.1093/database/bau036. Print 2014.
7
A robust PCR primer design platform applied to the detection of Acidobacteria Group 1 in soil.
Nucleic Acids Res. 2012 Jul;40(12):e96. doi: 10.1093/nar/gks238. Epub 2012 Mar 20.
8
Improving probe set selection for microbial community analysis by leveraging taxonomic information of training sequences.
BMC Bioinformatics. 2011 Oct 10;12:394. doi: 10.1186/1471-2105-12-394.