PERF: an exhaustive algorithm for ultra-fast and efficient identification of microsatellites from large DNA sequences.

Affiliation

CSIR - Centre for Cellular and Molecular Biology, Hyderabad, Telangana 500007, India.

Publication information

Bioinformatics. 2018 Mar 15;34(6):943-948. doi: 10.1093/bioinformatics/btx721.


DOI:10.1093/bioinformatics/btx721
PMID:29121165
Abstract

MOTIVATION: Microsatellites or Simple Sequence Repeats (SSRs) are short tandem repeats of DNA motifs present in all genomes. They have long been used for a variety of purposes in population genetics, genotyping, marker-assisted selection and forensics. Numerous studies have highlighted their functional roles in genome organization and gene regulation. Though several tools are currently available to identify SSRs from genomic sequences, they have significant limitations.

RESULTS: We present a novel algorithm called PERF for extremely fast and comprehensive identification of microsatellites from DNA sequences of any size. PERF is several-fold faster than existing algorithms and uses up to 5-fold less memory. It provides a clean and flexible command-line interface for changing the default settings, and produces output in an easily parseable tab-separated format. In addition, PERF generates an interactive, stand-alone HTML report with charts and tables for easy downstream analysis.

AVAILABILITY AND IMPLEMENTATION: PERF is implemented in the Python programming language. It is freely available on PyPI under the package name perf_ssr, and can be installed directly using pip or easy_install. The documentation of PERF is available at https://github.com/rkmlab/perf. The source code of PERF is deposited in GitHub at https://github.com/rkmlab/perf under an MIT license.

CONTACT: tej@ccmb.res.in.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
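
To make concrete what "identifying microsatellites" means, the following is a minimal, naive Python sketch of a tandem-repeat scan. It is not PERF's algorithm, which the abstract does not describe, and it makes no attempt at PERF's speed or memory efficiency. The function name `find_ssrs`, the motif length range of 1 to 6 bases, and the 12-base minimum repeat length are all assumptions chosen for illustration, not PERF's documented defaults.

```python
def find_ssrs(seq, min_motif=1, max_motif=6, min_length=12):
    """Naive scan for perfect tandem repeats (illustrative only, not PERF).

    Returns a list of (start, end, motif, length) tuples using 0-based,
    half-open coordinates. A trailing partial copy of the motif is
    counted as part of the repeat.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    i = 0
    while i < n:
        found = False
        # Try shorter motifs first so "AAAA..." is reported as motif "A",
        # not as "AA" or "AAA".
        for k in range(min_motif, max_motif + 1):
            if i + k > n:
                break
            # Extend the repeat while each base matches the base
            # one motif length (k positions) earlier.
            j = i + k
            while j < n and seq[j] == seq[j - k]:
                j += 1
            if j - i >= min_length:
                hits.append((i, j, seq[i:i + k], j - i))
                i = j  # resume scanning after this repeat
                found = True
                break
        if not found:
            i += 1
    return hits


# Hypothetical toy sequence: an (AC)6 repeat and an (A)12 repeat.
example = "GGT" + "AC" * 6 + "TTG" + "A" * 12 + "C"
for start, end, motif, length in find_ssrs(example):
    # Tab-separated output, in the spirit of the format the abstract mentions;
    # the column choice here is an assumption.
    print(f"{start}\t{end}\t{motif}\t{length}")
```

Because the scan resumes after the end of each reported repeat, a second repeat that overlaps the tail of the first is reported only from the point where the first one ends. A production tool would need to handle such overlaps, ambiguous bases and multi-record FASTA input.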

