KmerGO: A Tool to Identify Group-Specific Sequences With k-mers.

Authors

Wang Ying, Chen Qi, Deng Chao, Zheng Yiluan, Sun Fengzhu

Affiliations

Department of Automation, Xiamen University, Xiamen, China.

Xiamen Key Laboratory of Big Data Intelligent Analysis and Decision-Making, Xiamen, China.

Publication information

Front Microbiol. 2020 Aug 25;11:2067. doi: 10.3389/fmicb.2020.02067. eCollection 2020.


DOI:10.3389/fmicb.2020.02067
PMID:32983048
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7477287/
Abstract

Capturing group-specific sequences between two groups of genomic/metagenomic sequences is critical for the follow-up identification of single-nucleotide variants (SNVs), gene families, microbial species, or other elements associated with each group. A sequence that is present, or rich, in one group, but absent, or scarce, in another group is considered a "group-specific" sequence in our study. We developed a user-friendly tool, KmerGO, to identify group-specific sequences between two groups of genomic/metagenomic long sequences or high-throughput sequencing datasets. Compared with other tools, KmerGO captures group-specific k-mers (k up to 40 bps) with much lower requirements for computing resources in much shorter running time. For a 1.05 TB dataset (.fasta), it takes KmerGO about 21.5 h (including k-mer counting) to return assembled group-specific sequences on a regular stand-alone workstation with no more than 1 GB memory. Furthermore, KmerGO can also be applied to capture trait-associated sequences for continuous traits. Through multi-process parallel computing, KmerGO is implemented with both a graphical user interface and a command line on Linux and Windows, free from any pre-installed supporting environments, packages, and complex configurations. The output group-specific k-mers or sequences from KmerGO could be the inputs of other tools for the downstream discovery of biomarkers, such as genetic variants, species, or genes. KmerGO is available at https://github.com/ChnMasterOG/KmerGO.
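
To make the abstract's definition of a "group-specific" sequence concrete, the following is a minimal toy sketch in Python. It is not KmerGO's implementation, and it ignores k-mer counting at scale, reverse complements, statistical testing, and assembly. The function names `kmers` and `group_specific_kmers` and the thresholds `min_frac` and `max_frac` are hypothetical, chosen here only for illustration. It treats each sample as a single string and keeps a k-mer if it is present in enough samples of one group and in few enough samples of the other.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct k-mers in one sequence (presence/absence only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def group_specific_kmers(group_a, group_b, k, min_frac=1.0, max_frac=0.0):
    """Return k-mers present in at least min_frac of group_a samples
    and in at most max_frac of group_b samples.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    # Count, for each k-mer, how many samples of each group contain it.
    count_a = Counter(km for s in group_a for km in kmers(s, k))
    count_b = Counter(km for s in group_b for km in kmers(s, k))
    return {
        km
        for km, n in count_a.items()
        if n / len(group_a) >= min_frac
        and count_b[km] / len(group_b) <= max_frac
    }


# Toy data: two tiny "groups" of sequences.
group_a = ["ACGTACGT", "TTACGTAA"]
group_b = ["GGGCCCAT", "CCCATTTG"]

print(sorted(group_specific_kmers(group_a, group_b, k=4)))
# → ['ACGT', 'CGTA', 'TACG']
```

Relaxing `min_frac` below 1.0 or raising `max_frac` above 0.0 corresponds to the "rich" versus "scarce" wording in the definition, as opposed to strict presence versus absence.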

Figures
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e9f7/7477287/e9051c53f088/fmicb-11-02067-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e9f7/7477287/75bd0f5ea971/fmicb-11-02067-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e9f7/7477287/c1c5894a3ece/fmicb-11-02067-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e9f7/7477287/ac8cffe5e92d/fmicb-11-02067-g003.jpg

Similar articles

[1]
KmerGO: A Tool to Identify Group-Specific Sequences With k-mers.

Front Microbiol. 2020-8-25

[2]
Identifying Sequences for Microbial Communities Using Long k-mer Sequence Signatures.

Front Microbiol. 2018-5-3

[3]
Kmerind: A Flexible Parallel Library for K-mer Indexing of Biological Sequences on Distributed Memory Systems.

IEEE/ACM Trans Comput Biol Bioinform. 2017-10-9

[4]
KAnalyze: a fast versatile pipelined k-mer toolkit.

Bioinformatics. 2014-3-18

[5]
MegaGTA: a sensitive and accurate metagenomic gene-targeted assembler using iterative de Bruijn graphs.

BMC Bioinformatics. 2017-10-16

[6]
SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation.

PLoS One. 2016-10-5

[7]
Analysis of common k-mers for whole genome sequences using SSB-tree.

Genome Inform. 2002

[8]
Estimating the total genome length of a metagenomic sample using k-mers.

BMC Genomics. 2019-4-4

[9]
Squeakr: an exact and approximate k-mer counting system.

Bioinformatics. 2018-2-15

[10]
A k-mer-based method for the identification of phenotype-associated genomic biomarkers and predicting phenotypes of sequenced bacteria.

PLoS Comput Biol. 2018-10-22

Cited by

[1]
Deep learning neural network development for the classification of bacteriocin sequences produced by lactic acid bacteria.

F1000Res. 2025-6-20

[2]
The fourspine stickleback (Apeltes quadracus) has an XY sex chromosome system with polymorphic inversions on both X and Y chromosomes.

PLoS Genet. 2025-5-9

[3]
Spiral phyllotaxis predicts left-right asymmetric growth and style deflection in mirror-image flowers of Cyanella alba.

Nat Commun. 2025-4-18

[4]
Inferring Staphylococcus aureus host species and cross-species transmission from a genome-based model.

BMC Genomics. 2025-2-17

[5]
A survey of k-mer methods and applications in bioinformatics.

Comput Struct Biotechnol J. 2024-5-21

[6]
Comparison of k-mer-based comparative metagenomic tools and approaches.

Microbiome Res Rep. 2023-7-20

[7]
k-mer-Based Genome-Wide Association Studies in Plants: Advances, Challenges, and Perspectives.

Genes (Basel). 2023-7-13

[8]
Identifying individual-specific microbial DNA fingerprints from skin microbiomes.

Front Microbiol. 2022-10-6

[9]
The third international hackathon for applying insights into large-scale genomic composition to use cases in a wide range of organisms.

F1000Res. 2022

[10]
Hierarchical Microbial Functions Prediction by Graph Aggregated Embedding.

Front Genet. 2021-1-18
