Effloc: An Efficient Locating Algorithm for Mass-Occurrence Biological Patterns with FM-Index.

Author information

Guo Li-Lu

Affiliation

Department of Computer Science, Xidian University, Xi'an, Shaanxi, China.

Publication information

J Comput Biol. 2025 Sep;32(9):865-878. doi: 10.1089/cmb.2024.0925. Epub 2025 May 2.

DOI: 10.1089/cmb.2024.0925
PMID: 40314133

Abstract

Pattern locating is a crucial step in many biological sequence analysis tasks. The full-text minute-space index (FM-index), a compressed full-text indexing technology, has been introduced for locating biological patterns in ultra-long genomes because it offers a low memory footprint and a retrieval time independent of genome size. However, its locating time depends on the number of occurrences of the pattern in the genome, so it is not efficient enough for mass-occurrence biological patterns. To solve this problem, we propose Effloc, an efficient locating algorithm for mass-occurrence biological patterns in genomic sequences. It builds on two optimization techniques. First, rankings that share the same Burrows-Wheeler Transform (BWT) character are organized into a group and calculated together, which reduces the number of last-to-first column (LF) mapping operations required to jump forward to suffix array (SA) sampling points. Second, a dedicated structure records the jump status, which avoids the redundant mapping operations that arise when adjacent patterns share the same SA sampling point. Compared with the existing algorithm, Effloc significantly reduces the number of time-consuming mapping operations in mass-occurrence pattern locating. Ablation experiments verified our algorithm's effectiveness, and Effloc achieved faster locating speed than five state-of-the-art competing algorithms. The source code and data are released at https://github.com/Lilu-guo/Effloc.
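To make the cost described in the abstract concrete, the following is a minimal sketch of the standard (baseline) FM-index locate step, not Effloc itself and not code from the Effloc repository. All names here (`build_index`, `backward_search`, `locate`, `sample_rate`) are hypothetical and chosen for illustration, and the index is built naively for a toy string. Each occurrence is resolved by walking LF mappings until a sampled SA row is reached, so the total number of LF calls grows with the number of occurrences; this is the work that the two optimizations in the abstract aim to reduce.

```python
def build_index(text, sample_rate=4):
    """Build a toy FM-index: BWT, C table, occurrence table, sampled SA.

    Naive construction for illustration only; real indexes use compressed
    rank structures instead of full occurrence tables.
    """
    assert "$" not in text
    text += "$"  # unique terminator, smaller than any other character used
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # text[-1] is '$' when i == 0
    alphabet = sorted(set(text))
    # C[c] = number of characters in text that are strictly smaller than c
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    # Keep SA values only for text positions that are multiples of sample_rate
    sampled = {row: pos for row, pos in enumerate(sa) if pos % sample_rate == 0}
    return bwt, C, occ, sampled


def backward_search(pattern, C, occ, n):
    """Return the half-open range [lo, hi) of BWT rows prefixed by pattern."""
    lo, hi = 0, n
    for c in reversed(pattern):
        if c not in C:
            return 0, 0
        lo = C[c] + occ[c][lo]
        hi = C[c] + occ[c][hi]
        if lo >= hi:
            return 0, 0
    return lo, hi


def locate(pattern, index):
    """Return (sorted text positions of pattern, number of LF mappings used)."""
    bwt, C, occ, sampled = index
    lo, hi = backward_search(pattern, C, occ, len(bwt))
    positions, lf_calls = [], 0
    for row in range(lo, hi):  # one independent walk per occurrence
        steps = 0
        while row not in sampled:
            c = bwt[row]
            row = C[c] + occ[c][row]  # LF mapping: move one text position back
            steps += 1
            lf_calls += 1
        positions.append(sampled[row] + steps)
    return sorted(positions), lf_calls


index = build_index("ACGTACGTACGA")
positions, lf_calls = locate("ACG", index)
print(positions, lf_calls)
```

In this baseline every occurrence walks to a sample point on its own, even when neighboring rows take the same BWT character or end at the same sample. Grouping rows by BWT character and recording the jump status, as the abstract describes, target exactly those repeated LF computations.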
