
DeepHapNet: a haplotype assembly method based on RetNet and deep spectral clustering.

Authors

Luo Junwei, Wang Jiaojiao, Wei Jingjing, Yan Chaokun, Luo Huimin

Affiliations

School of Software, Henan Polytechnic University, Century Road 2001, Jiaozuo 454003, China.

College of Chemical and Environmental Engineering, Anyang Institute of Technology, West Section of Huanghe Avenue, Anyang 455000, China.

Publication information

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae656.

DOI: 10.1093/bib/bbae656
PMID: 39690881
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11652615/

Abstract

Gene polymorphism originates from single-nucleotide polymorphisms (SNPs), and the analysis and study of SNPs are of great significance in the field of biogenetics. The haplotype, which consists of the sequence of SNP loci, carries more genetic information than a single SNP. Haplotype assembly plays a significant role in understanding gene function, diagnosing complex diseases, and pinpointing species genes. We propose a novel method, DeepHapNet, for haplotype assembly through the clustering of reads and learning correlations between read pairs. We employ a sequence model called Retentive Network (RetNet), which utilizes a multiscale retention mechanism to extract read features and learn the global relationships among them. Based on the feature representation of reads learned from the RetNet model, the clustering process of reads is implemented using the SpectralNet model, and, finally, haplotypes are constructed based on the read clusters. Experiments with simulated and real datasets show that the method performs well in the haplotype assembly problem of diploid and polyploid based on either long or short reads. The code implementation of DeepHapNet and the processing scripts for experimental data are publicly available at https://github.com/wjj6666/DeepHapNet.
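
The last two steps of the pipeline described in the abstract (cluster the reads, then build haplotypes from the read clusters) can be illustrated with a minimal sketch. This is not the DeepHapNet implementation: it assumes a diploid sample with error-free reads, and it swaps the learned RetNet features and the SpectralNet model for a plain eigenvector split of a signed read-similarity matrix. The function names, the +1/-1/0 allele encoding, and the toy read matrix are all hypothetical choices made for illustration.

```python
import numpy as np


def cluster_reads_diploid(reads):
    """Split reads into two groups with a simple spectral method.

    reads: (n_reads, n_snps) array with +1 = reference allele,
    -1 = alternative allele, 0 = SNP not covered by the read.
    NOTE: a stand-in for RetNet + SpectralNet, not the paper's method.
    """
    reads = np.asarray(reads, dtype=float)
    # Entry (i, j) = agreements minus disagreements on SNPs both reads cover.
    similarity = reads @ reads.T
    # eigh sorts eigenvalues in ascending order, so the last column
    # is the eigenvector of the largest eigenvalue.
    _, vectors = np.linalg.eigh(similarity)
    leading = vectors[:, -1]
    # An eigenvector's sign is arbitrary; orient it so read 0 is in cluster 0.
    if leading[0] < 0:
        leading = -leading
    return (leading < 0).astype(int)


def build_haplotypes(reads, labels, n_clusters=2):
    """Majority vote per SNP within each read cluster.

    Returns an (n_clusters, n_snps) array of +1 / -1 alleles;
    0 marks a SNP that is uncovered or tied within the cluster.
    """
    reads = np.asarray(reads, dtype=int)
    haplotypes = np.zeros((n_clusters, reads.shape[1]), dtype=int)
    for cluster in range(n_clusters):
        votes = reads[labels == cluster].sum(axis=0)
        haplotypes[cluster] = np.sign(votes)
    return haplotypes


# Toy read-by-SNP matrix (hypothetical data): six overlapping reads drawn
# from two complementary haplotypes over six SNP loci.
reads = np.array([
    [ 1, -1,  1,  0,  0,  0],
    [-1,  1, -1,  0,  0,  0],
    [ 0, -1,  1,  1, -1,  0],
    [ 0,  0, -1, -1,  1,  1],
    [ 0,  0,  0,  1, -1, -1],
    [ 0,  0,  0, -1,  1,  1],
])

labels = cluster_reads_diploid(reads)
haplotypes = build_haplotypes(reads, labels)
print(labels)
print(haplotypes)
```

The sign-based split only applies to two haplotypes. The polyploid case mentioned in the abstract needs a clustering step that handles more than two groups, which is the role SpectralNet plays in the paper.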

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1aed/11652615/91904fea4e22/bbae656f1.jpg
