MAFcounter: An efficient tool for counting the occurrences of k-mers in MAF files.

Authors

Patsakis Michail, Provatas Kimonas, Mouratidis Ioannis, Georgakopoulos-Soares Ilias

Affiliations

Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA.

Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA.

Publication information

ArXiv. 2024 Nov 29:arXiv:2411.19427v1.

Abstract

MOTIVATION

With the rapid expansion of large-scale biological datasets, DNA and protein sequence alignments have become essential for comparative genomics and proteomics. These alignments facilitate the exploration of sequence similarity patterns, providing valuable insights into sequence conservation and evolutionary relationships and supporting functional analyses. Sequence alignments are typically stored in formats such as the Multiple Alignment Format (MAF). Counting k-mer occurrences is a crucial task in many computational biology applications, but there is currently no algorithm designed for k-mer counting in alignment files.

RESULTS

We have developed MAFcounter, the first k-mer counter dedicated to alignment files. MAFcounter is multithreaded, fast, and memory efficient, enabling k-mer counting in DNA and protein sequence alignment files.

AVAILABILITY

The MAFcounter package and its Python bindings are released under the GPL license as a multi-platform application and are available at: https://github.com/Georgakopoulos-Soares-lab/MAFcounter.
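
EXAMPLE

To make the task concrete, the following is a minimal sketch of k-mer counting over the sequence lines of a MAF file. It is not MAFcounter's implementation and does not use its Python bindings; the function name count_kmers_in_maf is hypothetical. It assumes that gap characters are dropped and that case is ignored before counting, which may differ from how MAFcounter treats gaps and soft-masked bases.

```python
from collections import Counter
from typing import Iterable


def count_kmers_in_maf(lines: Iterable[str], k: int) -> Counter:
    """Count k-mers over the sequence ("s") lines of a MAF alignment.

    Illustrative only: a single-threaded, in-memory sketch.
    """
    counts: Counter = Counter()
    for line in lines:
        fields = line.split()
        # MAF sequence lines have seven fields:
        # s src start size strand srcSize text
        if len(fields) != 7 or fields[0] != "s":
            continue
        # Assumption: remove alignment gaps and ignore case before counting.
        seq = fields[6].replace("-", "").upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


# A tiny hand-written alignment block (made-up coordinates).
example = """##maf version=1
a score=10.0
s hg38.chr1 0 6 + 100 ACG-TAC
s mm10.chr1 5 5 + 200 ACGTT--
""".splitlines()

kmer_counts = count_kmers_in_maf(example, 3)
for kmer, n in sorted(kmer_counts.items()):
    print(kmer, n)
```

Here the two aligned rows contribute the ungapped sequences ACGTAC and ACGTT, so the 3-mers ACG and CGT are each counted twice. A dedicated tool is needed for real alignment files because they can be far too large for a single in-memory dictionary.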
