A new statistic for efficient detection of repetitive sequences.

Affiliations

Department of Automation, MOE Key Laboratory of Bioinformatics, Bioinformatics Division and Center for Synthetic & Systems Biology, BNRist, Tsinghua University, Beijing 100084, China.

Quantitative and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA.

Publication information

Bioinformatics. 2019 Nov 1;35(22):4596-4606. doi: 10.1093/bioinformatics/btz262.

DOI:10.1093/bioinformatics/btz262
PMID:30993316
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7963086/
Abstract

MOTIVATION

Detecting sequences that contain repetitive regions is a basic bioinformatics task with many applications. Several methods have been developed for various types of repeat detection tasks, but an efficient generic method for detecting most types of repetitive sequences is still desirable. Inspired by the excellent properties and successful applications of the D2 family of statistics in comparative analyses of genomic sequences, we developed a new statistic, D2R, that can efficiently discriminate sequences with repetitive regions from those without.
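
For readers unfamiliar with the D2 family, the following minimal Python sketch may help. The function `d2` computes the classic D2 statistic between two sequences, which is the sum over all k-mers of the product of their counts in each sequence. The function `self_match_score` is a hypothetical illustration only: it counts ordered pairs of distinct positions within one sequence that share a k-mer, which conveys why k-mer counts can signal repeats. It is an assumption made for this example and is not the definition of D2R, which the abstract does not give. All function names here are invented for illustration.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq with a single pass over the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a: str, seq_b: str, k: int) -> int:
    """Classic D2: sum over k-mers w of count_a(w) * count_b(w)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for k-mers that are absent from seq_b.
    return sum(n * counts_b[w] for w, n in counts_a.items())


def self_match_score(seq: str, k: int) -> int:
    """Hypothetical repeat signal (NOT the paper's D2R): the number of
    ordered pairs of distinct positions in seq that carry the same k-mer."""
    return sum(n * (n - 1) for n in kmer_counts(seq, k).values())


if __name__ == "__main__":
    print(d2("ACGT", "ACGT", 2))
    print(self_match_score("ACGTACGT", 4))  # the k-mer ACGT occurs twice
    print(self_match_score("ACGTTGCA", 4))  # all k-mers are distinct
```

Because the counts are gathered in one pass with a hash table, the number of k-mers processed grows linearly with sequence length for a fixed k, which is the kind of cost profile the abstract reports for its own algorithm.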

RESULTS

Using the statistic, we developed an algorithm with linear time and space complexity for detecting most types of repetitive sequences in multiple scenarios, including finding candidate clustered regularly interspaced short palindromic repeats (CRISPR) regions in bacterial genomic or metagenomic sequences. Simulation and real data experiments show that the method works well on both assembled sequences and unassembled short reads.

AVAILABILITY AND IMPLEMENTATION

The code is available at https://github.com/XuegongLab/D2R_codes under the GPL 3.0 license.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
