NeuralPolish: a novel Nanopore polishing method based on alignment matrix construction and orthogonal Bi-GRU Networks.

Authors

Huang Neng, Nie Fan, Ni Peng, Luo Feng, Gao Xin, Wang Jianxin

Affiliations

School of Computer Science and Engineering, Central South University, Changsha 410083, China.

Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China.

Publication information

Bioinformatics. 2021 Oct 11;37(19):3120-3127. doi: 10.1093/bioinformatics/btab354.

DOI: 10.1093/bioinformatics/btab354
PMID: 33973998

Abstract

MOTIVATION

Oxford Nanopore sequencing, which produces long reads at low cost, has enabled many breakthroughs in genomics studies. However, the large number of errors in Nanopore genome assemblies affects the accuracy of genome analysis. Polishing is a procedure that corrects errors in a genome assembly and can improve the reliability of downstream analysis. However, the performance of existing polishing methods is still not satisfactory.

RESULTS

We developed a novel polishing method, NeuralPolish, to correct the errors in assemblies based on alignment matrix construction and orthogonal Bi-GRU networks. In this method, we designed an alignment feature matrix for representing read-to-assembly alignment. Each row of the matrix represents a read, and each column represents the aligned bases at each position of the contig. In the network architecture, a bi-directional GRU network is used to extract the sequence information inside each read by processing the alignment matrix row by row. After that, the feature matrix is processed by another bi-directional GRU network column by column to calculate the probability distribution. Finally, a CTC decoder generates a polished sequence with a greedy algorithm. We used five real datasets and three assembly tools including Wtdbg2, Flye and Canu for testing, and compared the results of different polishing methods including NeuralPolish, Racon, MarginPolish, HELEN and Medaka. Comprehensive experiments demonstrate that NeuralPolish achieves more accurate assembly with fewer errors than other polishing methods and can improve the accuracy of assembly obtained by different assemblers.
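
The alignment matrix and the greedy CTC decoding step described above can be illustrated with a small Python sketch. This is not the authors' implementation: the function names `build_alignment_matrix` and `ctc_greedy_decode`, the `.` marker for uncovered positions, the five-label alphabet with the blank at index 0, and the toy probabilities are all assumptions made for illustration. The two Bi-GRU networks are omitted, so the hard-coded probabilities stand in for whatever the column-wise network would output.

```python
from typing import List, Sequence, Tuple

UNCOVERED = "."                      # assumed marker: read does not span this column
ALPHABET = ["", "A", "C", "G", "T"]  # assumed labels; index 0 is the CTC blank
BLANK = 0


def build_alignment_matrix(contig_len: int,
                           reads: Sequence[Tuple[int, str]]) -> List[List[str]]:
    """Return one row per read and one column per contig position.

    Each read is a hypothetical (start, aligned) pair, where `aligned`
    holds one symbol per contig column the read covers ('-' for a deletion).
    """
    matrix = []
    for start, aligned in reads:
        row = [UNCOVERED] * contig_len
        for offset, symbol in enumerate(aligned):
            col = start + offset
            if 0 <= col < contig_len:
                row[col] = symbol
        matrix.append(row)
    return matrix


def ctc_greedy_decode(probs: Sequence[Sequence[float]]) -> str:
    """Greedy CTC decoding: take the best label per step,
    collapse consecutive repeats, then drop blanks."""
    decoded = []
    prev = BLANK
    for step in probs:
        best = max(range(len(step)), key=lambda i: step[i])
        if best != BLANK and best != prev:
            decoded.append(ALPHABET[best])
        prev = best
    return "".join(decoded)


if __name__ == "__main__":
    # Two toy reads aligned to a 5-base contig.
    for row in build_alignment_matrix(5, [(0, "ACG"), (2, "G-T")]):
        print("".join(row))

    # Toy per-step distributions over (blank, A, C, G, T).
    toy_probs = [
        [0.1, 0.6, 0.1, 0.1, 0.1],
        [0.2, 0.5, 0.1, 0.1, 0.1],
        [0.7, 0.1, 0.1, 0.05, 0.05],
        [0.1, 0.7, 0.1, 0.05, 0.05],
        [0.1, 0.1, 0.6, 0.1, 0.1],
    ]
    print(ctc_greedy_decode(toy_probs))
```

In the toy decoding input, the blank between the second and third `A` predictions is what lets the decoder emit two separate `A` bases; without it, consecutive identical predictions collapse into one.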

AVAILABILITY AND IMPLEMENTATION

https://github.com/huangnengCSU/NeuralPolish.git.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1. NeuralPolish: a novel Nanopore polishing method based on alignment matrix construction and orthogonal Bi-GRU Networks. Bioinformatics. 2021 Oct 11;37(19):3120-3127. doi: 10.1093/bioinformatics/btab354.
2. BlockPolish: accurate polishing of long-read assembly via block divide-and-conquer. Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab405.
3. Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm. Bioinformatics. 2020 Jun 1;36(12):3669-3679. doi: 10.1093/bioinformatics/btaa179.
4. Benchmarking short and long read polishing tools for nanopore assemblies: achieving near-perfect genomes for outbreak isolates. BMC Genomics. 2024 Jul 8;25(1):679. doi: 10.1186/s12864-024-10582-x.
5. The impact of applying various de novo assembly and correction tools on the identification of genome characterization, drug resistance, and virulence factors of clinical isolates using ONT sequencing. BMC Biotechnol. 2023 Jul 31;23(1):26. doi: 10.1186/s12896-023-00797-3.
6. SACall: A Neural Network Basecaller for Oxford Nanopore Sequencing Data Based on Self-Attention Mechanism. IEEE/ACM Trans Comput Biol Bioinform. 2022 Jan-Feb;19(1):614-623. doi: 10.1109/TCBB.2020.3039244. Epub 2022 Feb 3.
7. Benchmarking of long-read sequencing, assemblers and polishers for yeast genome. Brief Bioinform. 2022 May 13;23(3). doi: 10.1093/bib/bbac146.
8. An exploration of assembly strategies and quality metrics on the accuracy of the rewarewa (Knightia excelsa) genome. Mol Ecol Resour. 2021 Aug;21(6):2125-2144. doi: 10.1111/1755-0998.13406. Epub 2021 Jun 19.
9. Benchmarking of de novo assembly algorithms for Nanopore data reveals optimal performance of OLC approaches. BMC Genomics. 2016 Aug 22;17 Suppl 7(Suppl 7):507. doi: 10.1186/s12864-016-2895-8.
10. JASPER: A fast genome polishing tool that improves accuracy of genome assemblies. PLoS Comput Biol. 2023 Mar 31;19(3):e1011032. doi: 10.1371/journal.pcbi.1011032. eCollection 2023 Mar.

Cited by

1. Effect of a Gluten-Free Diet on the Intestinal Microbiota of Women with Celiac Disease. Antibiotics (Basel). 2025 Aug 2;14(8):785. doi: 10.3390/antibiotics14080785.
2. GoldPolish-target: targeted long-read genome assembly polishing. BMC Bioinformatics. 2025 Mar 7;26(1):78. doi: 10.1186/s12859-025-06091-7.
3. Simple, reference-independent assessment to empirically guide correction and polishing of hybrid microbial community metagenomic assembly. PeerJ. 2024 Nov 8;12:e18132. doi: 10.7717/peerj.18132. eCollection 2024.
4. Variability of plant transcriptomic responses under stress acclimation: a review from high throughput studies. Acta Biochim Pol. 2024 Oct 25;71:13585. doi: 10.3389/abp.2024.13585. eCollection 2024.
5. Upcoming progress of transcriptomics studies on plants: An overview. Front Plant Sci. 2022 Dec 15;13:1030890. doi: 10.3389/fpls.2022.1030890. eCollection 2022.
6. Nanopore sequencing technology, bioinformatics and applications. Nat Biotechnol. 2021 Nov;39(11):1348-1365. doi: 10.1038/s41587-021-01108-x. Epub 2021 Nov 8.
7. Comparative evaluation of Nanopore polishing tools for microbial genome assembly and polishing strategies for downstream analysis. Sci Rep. 2021 Oct 20;11(1):20740. doi: 10.1038/s41598-021-00178-w.