MAFcounter: an efficient tool for counting the occurrences of k-mers in MAF files.

Authors

Patsakis Michail, Provatas Kimonas, Karatzikos Aris, Koilakos Charalampos, Mouratidis Ioannis, Georgakopoulos-Soares Ilias

Affiliations

Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, The Pennsylvania State University College of Medicine, Hershey, PA, USA.

Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA.

Publication information

BMC Bioinformatics. 2025 May 30;26(1):142. doi: 10.1186/s12859-025-06172-7.

Abstract

MOTIVATION

With the rapid expansion of large-scale biological datasets, DNA and protein sequence alignments have become essential for comparative genomics and proteomics. These alignments facilitate the exploration of sequence similarity patterns, providing valuable insights into sequence conservation, evolutionary relationships, and function. Sequence alignments are typically stored in formats such as the Multiple Alignment Format (MAF). Counting k-mer occurrences is a crucial task in many computational biology applications, but no algorithm has so far been designed for k-mer counting in alignment files.

RESULTS

We have developed MAFcounter, the first k-mer counter dedicated to alignment files. MAFcounter is multithreaded, fast, and memory efficient, enabling k-mer counting in DNA and protein sequence alignment files with a wide variety of features for k-mer analysis.
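To make the task concrete, the following is a minimal Python sketch of what counting k-mers over a MAF file involves. It is not MAFcounter's implementation or interface: the function name `count_kmers_in_maf`, the decision to strip gap characters before counting, and the toy alignment block are all assumptions made for illustration. It relies only on the standard MAF layout, in which each sequence line starts with `s` and carries the aligned text in its seventh field.

```python
from collections import Counter


def count_kmers_in_maf(lines, k):
    """Count k-mers across all sequence ('s') lines of MAF-formatted text.

    Hypothetical helper for illustration; not part of MAFcounter.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        # MAF sequence line layout: s src start size strand srcSize text
        if len(fields) != 7 or fields[0] != "s":
            continue
        # Assumption: gap characters are dropped before k-mers are extracted.
        seq = fields[6].replace("-", "").upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


# Toy alignment block (made-up coordinates, for illustration only).
example = """##maf version=1
a score=23262.0
s hg18.chr7    27578828 7 + 158545518 AAA-GGGA
s panTro1.chr6 28741140 7 + 161576975 AAA-GGGA
""".splitlines()

counts = count_kmers_in_maf(example, 3)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

This single-threaded sketch holds every count in one in-memory dictionary, so it only illustrates the input and output of the task, not the multithreading or memory efficiency described above.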

AVAILABILITY

MAFcounter is released under the GPL license as a suite of binary C++ applications and is available at https://github.com/Georgakopoulos-Soares-lab/MAFcounter.

