Cliffy: robust 16S rRNA classification based on a compressed LCA index.

Authors

Omar Ahmed, Christina Boucher, Ben Langmead

Publication information

bioRxiv. 2024 May 30:2024.05.25.595899. doi: 10.1101/2024.05.25.595899.

DOI:10.1101/2024.05.25.595899
PMID:38854039
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11160684/
Abstract

Taxonomic sequence classification is a computational problem central to the study of metagenomics and evolution. Advances in compressed indexing with the r-index enable full-text pattern matching against large sequence collections. But the data structures that link pattern sequences to their clades of origin still do not scale well to large collections. Previous work proposed the document array profiles, which use O(rd) words of space, where r is the number of maximal equal-letter runs in the Burrows-Wheeler transform and d is the number of distinct genomes. The linear dependence on d is limiting, since real taxonomies can easily contain tens of thousands of leaves or more. We propose a method called cliff compression that reduces this size by a large factor, over 250x when indexing the SILVA 16S rRNA gene database. This method uses Θ(r log d) words of space in expectation under a random model we propose here. We implemented these ideas in an open source tool called Cliffy that performs efficient taxonomic classification of sequencing reads with respect to a compressed taxonomic index. When applied to simulated 16S rRNA reads, Cliffy's read-level accuracy is higher than Kraken2's by 11-18%. Clade abundances are also more accurately predicted by Cliffy compared to Kraken2 and Bracken. Overall, Cliffy is a fast and space-economical extension to compressed full-text indexes, enabling them to perform fast and accurate taxonomic classification queries.

2012 ACM SUBJECT CLASSIFICATION: Applied computing → Computational genomics.
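
The quantities named in the abstract can be made concrete with a small sketch. The Python below is illustrative only and is not taken from Cliffy: it builds a Burrows-Wheeler transform by naive rotation sorting (suitable only for toy inputs), counts r as the number of maximal equal-letter runs, and evaluates the two growth terms r·d and r·log2(d) with all constant factors ignored. The function names `bwt`, `count_runs`, `profile_words`, and `cliff_growth_term` are hypothetical, and the sketch does not implement cliff compression itself.

```python
import math


def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted rotations (toy inputs only)."""
    if "$" in text:
        raise ValueError("text must not contain the terminator '$'")
    s = text + "$"  # '$' sorts before letters and marks the end of the text
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_runs(s: str) -> int:
    """Number of maximal equal-letter runs in s (the quantity r)."""
    return sum(1 for i, c in enumerate(s) if i == 0 or c != s[i - 1])


def profile_words(r: int, d: int) -> int:
    """Growth term r * d of the O(rd) bound, constants ignored."""
    return r * d


def cliff_growth_term(r: int, d: int) -> float:
    """Growth term r * log2(d) of the expected Theta(r log d) bound,
    constants ignored (assumption: base-2 logarithm for illustration)."""
    return r * math.log2(d)


if __name__ == "__main__":
    transformed = bwt("banana")
    r = count_runs(transformed)
    print(transformed, r)
    for d in (16, 1024, 16384):
        print(d, profile_words(r, d), round(cliff_growth_term(r, d), 1))
```

Because constants are dropped, the printed numbers show only how the two terms scale with d; they are not estimates of real index sizes and should not be compared with the 250x figure reported for SILVA.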
