From tradition to innovation: conventional and deep learning frameworks in genome annotation.

Affiliations

National Key Laboratory for Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangzhou 518120, China.

College of Biomedical Engineering, Taiyuan University of Technology, Jinzhong 030600, China.

Publication information

Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae138.

DOI:10.1093/bib/bbae138

PMID:38581418

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10998533/

Abstract

Following the milestone success of the Human Genome Project, the 'Encyclopedia of DNA Elements (ENCODE)' initiative was launched in 2003 to unearth information about the numerous functional elements within the genome. This endeavor coincided with the emergence of numerous novel technologies and with the availability of vast amounts of whole-genome sequence and high-throughput data such as ChIP-Seq and RNA-Seq. Extracting biologically meaningful information from these massive datasets has become a critical aspect of many recent studies, particularly in annotating and predicting the functions of unknown genes. The core idea behind genome annotation is to identify genes and various functional elements within the genome sequence and infer their biological functions. Traditional wet-lab experimental methods still require extensive effort for functional verification. Early bioinformatics algorithms and software, meanwhile, primarily employed shallow learning techniques, so their ability to characterize data and learn features was limited. With the widespread adoption of RNA-Seq technology, scientists from the biological community began to harness the potential of machine learning and deep learning approaches for gene structure prediction and functional annotation. In this context, we review both conventional methods and contemporary deep learning frameworks and highlight novel perspectives on the challenges arising during annotation, underscoring the dynamic nature of this evolving scientific landscape.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e8e/10998533/d8db50346146/bbae138f1.jpg

Similar articles

SAKit: An all-in-one analysis pipeline for identifying novel proteins resulting from variant events at both large and small scales.

J Bioinform Comput Biol. 2024 Oct;22(5):2450022. doi: 10.1142/S0219720024500227. Epub 2024 Oct 1.

mEMbrain: an interactive deep learning MATLAB tool for connectomic segmentation on commodity desktops.

Front Neural Circuits. 2023 Jun 15;17:952921. doi: 10.3389/fncir.2023.952921. eCollection 2023.

Deep Genomics: Deep Learning-Based Analysis of Genome-Sequenced Data for Identification of Gene Alterations.

Methods Mol Biol. 2025;2952:335-367. doi: 10.1007/978-1-0716-4690-8_20.

Predicting cognitive decline: Deep-learning reveals subtle brain changes in pre-MCI stage.

J Prev Alzheimers Dis. 2025 May;12(5):100079. doi: 10.1016/j.tjpad.2025.100079. Epub 2025 Feb 6.

Can a Liquid Biopsy Detect Circulating Tumor DNA With Low-passage Whole-genome Sequencing in Patients With a Sarcoma? A Pilot Evaluation.

Clin Orthop Relat Res. 2025 Jan 1;483(1):39-48. doi: 10.1097/CORR.0000000000003161. Epub 2024 Jun 21.

scFTAT: a novel cell annotation method integrating FFT and transformer.

BMC Bioinformatics. 2025 Feb 25;26(1):62. doi: 10.1186/s12859-025-06061-z.

Multimodal zero-shot learning of previously unseen epitranscriptomes from RNA-seq data.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf332.

Cited by

Predicting bacterial phenotypic traits through improved machine learning using high-quality, curated datasets.

Commun Biol. 2025 Jun 7;8(1):897. doi: 10.1038/s42003-025-08313-3.

Research Progress of Genomics Applications in Secondary Metabolites of Medicinal Plants: A Case Study in Safflower.

Int J Mol Sci. 2025 Apr 19;26(8):3867. doi: 10.3390/ijms26083867.

Identification of key genes and immune infiltration of diabetic peripheral neuropathy in mice and humans based on bioinformatics analysis.

Front Endocrinol (Lausanne). 2024 Nov 18;15:1437979. doi: 10.3389/fendo.2024.1437979. eCollection 2024.

Navigating the archaeal frontier: insights and projections from bioinformatic pipelines.

Front Microbiol. 2024 Sep 23;15:1433224. doi: 10.3389/fmicb.2024.1433224. eCollection 2024.
