INSnet: a method for detecting insertions based on deep learning network.

Affiliation

School of Software, Henan Polytechnic University, Jiaozuo, 454003, China.

Publication information

BMC Bioinformatics. 2023 Mar 6;24(1):80. doi: 10.1186/s12859-023-05216-0.

DOI: 10.1186/s12859-023-05216-0

PMID: 36879189

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9990265/

Abstract

BACKGROUND

Many studies have shown that structural variations (SVs) strongly impact human disease. As a common type of SV, insertions are usually associated with genetic diseases. Therefore, accurately detecting insertions is of great significance. Although many methods for detecting insertions have been proposed, these methods often generate some errors and miss some variants. Hence, accurately detecting insertions remains a challenging task.

RESULTS

In this paper, we propose INSnet, a method that detects insertions using a deep learning network. First, INSnet divides the reference genome into contiguous sub-regions and extracts five features for each locus from the alignments between long reads and the reference genome. Next, INSnet applies a depthwise separable convolutional network, whose convolution operations extract informative features from both spatial and channel information. INSnet then uses two attention mechanisms, the convolutional block attention module (CBAM) and efficient channel attention (ECA), to extract key alignment features in each sub-region. To capture the relationship between adjacent sub-regions, INSnet uses a gated recurrent unit (GRU) network to extract further SV signatures. After predicting whether a sub-region contains an insertion, INSnet determines the precise site and length of the insertion. The source code is available on GitHub at https://github.com/eioyuou/INSnet.

CONCLUSION

Experimental results show that INSnet can achieve better performance than other methods in terms of F1 score on real datasets.
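
The first two steps named in the Results (cutting the reference into sub-regions, then applying a depthwise separable convolution to per-locus feature channels) can be illustrated with a small sketch. This is not INSnet's implementation: it is a pure-Python, one-dimensional toy written only to show the idea. The function names, the window size, and the example kernels are all hypothetical, and because the abstract does not name the five features, channels are treated here as anonymous rows of numbers.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # layout: [channel][position]


def split_into_subregions(contig_length: int, window: int) -> List[Tuple[int, int]]:
    """Cut [0, contig_length) into contiguous half-open windows.

    The window size is a free parameter in this sketch; the last
    window is shorter when contig_length is not a multiple of it.
    """
    if contig_length < 0 or window <= 0:
        raise ValueError("contig_length must be >= 0 and window must be > 0")
    return [(s, min(s + window, contig_length))
            for s in range(0, contig_length, window)]


def depthwise_conv1d(x: Matrix, kernels: Matrix) -> Matrix:
    """Depthwise step: each channel is filtered by its own kernel.

    Zero padding keeps the output length equal to the input length.
    Kernels are assumed to have odd length.
    """
    if len(x) != len(kernels):
        raise ValueError("one kernel per input channel is required")
    out: Matrix = []
    for channel, kernel in zip(x, kernels):
        half = len(kernel) // 2
        n = len(channel)
        row = []
        for i in range(n):
            acc = 0.0
            for k, w in enumerate(kernel):
                j = i + k - half
                if 0 <= j < n:  # positions outside the window count as zero
                    acc += w * channel[j]
            row.append(acc)
        out.append(row)
    return out


def pointwise_conv1d(x: Matrix, weights: Matrix) -> Matrix:
    """Pointwise (1x1) step: mix channels independently at each position.

    weights has layout [out_channel][in_channel].
    """
    n = len(x[0])
    return [[sum(w * x[c][i] for c, w in enumerate(w_row)) for i in range(n)]
            for w_row in weights]


def depthwise_separable_conv1d(x: Matrix, depth_kernels: Matrix,
                               point_weights: Matrix) -> Matrix:
    """Depthwise filtering followed by pointwise channel mixing."""
    return pointwise_conv1d(depthwise_conv1d(x, depth_kernels), point_weights)


# Toy usage: two feature channels over a three-locus sub-region.
windows = split_into_subregions(1000, 400)
features = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
depth_kernels = [[0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]  # identity, neighbour sum
point_weights = [[1.0, 1.0]]                         # add the two channels
mixed = depthwise_separable_conv1d(features, depth_kernels, point_weights)
```

Splitting the convolution this way separates the spatial filtering (within a channel) from the channel mixing (across channels), which is the "spatial information and channel information" distinction the abstract refers to. A standard convolution with kernel size k, C_in input channels, and C_out output channels uses k * C_in * C_out weights, whereas the separable form uses k * C_in + C_in * C_out.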

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9598/9990265/19aaf8bd24b0/12859_2023_5216_Fig1_HTML.jpg
