Large-scale sequence comparisons with sourmash

Authors

Pierce N Tessa, Irber Luiz, Reiter Taylor, Brooks Phillip, Brown C Titus

Affiliations

Department of Population Health and Reproduction, University of California, Davis, Davis, California, 95616, USA.

Publication information

F1000Res. 2019 Jul 4;8:1006. doi: 10.12688/f1000research.19675.1. eCollection 2019.

Abstract

The sourmash software package uses MinHash-based sketching to create "signatures", compressed representations of DNA, RNA, and protein sequences, that can be stored, searched, explored, and taxonomically annotated. sourmash signatures can be used to estimate sequence similarity between very large data sets quickly and in low memory, and can be used to search large databases of genomes for matches to query genomes and metagenomes. sourmash is implemented in C++, Rust, and Python, and is freely available under the BSD license at http://github.com/dib-lab/sourmash.
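The idea behind MinHash-based sketching can be illustrated with a minimal sketch in plain Python. This is not sourmash's API or implementation: the function names (`kmers`, `hash_kmer`, `sketch`, `jaccard_estimate`), the BLAKE2b hash, and the defaults `k=21` and `num=500` are all assumptions chosen for illustration. Each sequence is reduced to its `num` smallest k-mer hashes, and the similarity of two sequences is estimated from those small sketches rather than from the full k-mer sets.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping k-mer of seq (uppercased)."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (illustrative hash choice)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=21, num=500):
    """Bottom-`num` sketch: the `num` smallest distinct k-mer hashes."""
    hashes = {hash_kmer(km) for km in kmers(seq, k)}
    return sorted(hashes)[:num]


def jaccard_estimate(sketch_a, sketch_b, num=500):
    """Estimate Jaccard similarity from two bottom-`num` sketches."""
    set_a, set_b = set(sketch_a), set(sketch_b)
    # The `num` smallest hashes of the union act as a random sample
    # of the combined k-mer set.
    merged = sorted(set_a | set_b)[:num]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)
```

Calling `jaccard_estimate(sketch(seq_a), sketch(seq_b))` on two DNA strings returns a value between 0 and 1. Because each sketch holds at most `num` integers regardless of sequence length, comparisons stay cheap in time and memory, which is the property the abstract describes.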
