MEHunter: transformer-based mobile element variant detection from long reads.

Affiliations

Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China.

Zhengzhou Research Institute, Harbin Institute of Technology, Zhengzhou, Henan 450000, China.

Publication information

Bioinformatics. 2024 Sep 2;40(9). doi: 10.1093/bioinformatics/btae557.

DOI: 10.1093/bioinformatics/btae557
PMID: 39287014
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11415824/
Abstract

SUMMARY

Mobile genetic elements (MEs) are heritable mutagens that contribute significantly to genetic diseases. Long-read sequencing technologies, which can resolve large DNA fragments, offer promising prospects for the comprehensive detection of ME variants (MEVs). However, achieving high precision while maintaining recall remains challenging, mainly because MEV signatures vary in length, share similar sequence content, and are often obscured by noise in long reads. Here, we propose MEHunter, a high-performance MEV detection approach that uses a fine-tuned transformer model adept at identifying potential MEVs with fragmented features. Benchmark experiments on both simulated and real datasets demonstrate that MEHunter consistently achieves higher accuracy and sensitivity than state-of-the-art tools. Furthermore, it can detect novel, potentially individual-specific MEVs that have been overlooked in published population projects.
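
To make the precision/recall trade-off mentioned above concrete, the following is a minimal sketch of how MEV calls could be scored against a truth set. It is not the paper's evaluation protocol: the `Call` record, the `benchmark` function, the 50 bp breakpoint tolerance, and the greedy one-to-one matching rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Hypothetical record for one MEV call (not MEHunter's output format)."""
    chrom: str
    pos: int
    me_type: str  # e.g. "ALU", "L1", "SVA"


def benchmark(calls, truth, tolerance=50):
    """Score calls against truth with greedy one-to-one matching.

    A call counts as a true positive if an unmatched truth entry has the
    same chromosome and ME type and lies within `tolerance` bp (assumed value).
    Returns (precision, recall, f1).
    """
    unmatched = list(truth)
    tp = 0
    for c in calls:
        hit = next(
            (t for t in unmatched
             if t.chrom == c.chrom
             and t.me_type == c.me_type
             and abs(t.pos - c.pos) <= tolerance),
            None,
        )
        if hit is not None:
            unmatched.remove(hit)  # each truth entry can be matched only once
            tp += 1
    fp = len(calls) - tp   # calls with no truth partner
    fn = len(unmatched)    # truth entries that were never called
    precision = tp / (tp + fp) if calls else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data, invented purely for illustration.
truth = [Call("chr1", 1000, "ALU"), Call("chr1", 5000, "L1"), Call("chr2", 300, "SVA")]
calls = [Call("chr1", 1010, "ALU"), Call("chr1", 5200, "L1"),
         Call("chr2", 290, "SVA"), Call("chr3", 10, "ALU")]
precision, recall, f1 = benchmark(calls, truth)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy example, a noisy breakpoint (the L1 call 200 bp away from its truth position) costs both a false positive and a false negative, which is one way read noise can depress precision and recall at the same time.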

AVAILABILITY AND IMPLEMENTATION

MEHunter is available from https://github.com/120L021101/MEHunter.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fdf1/11415824/24ef7e8128f9/btae557f1.jpg
