CrepHAN: cross-species prediction of enhancers by using hierarchical attention networks.

Authors

Hong Jianwei, Gao Ruitian, Yang Yang

Affiliations

Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China.

School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China.

Publication information

Bioinformatics. 2021 Oct 25;37(20):3436-3443. doi: 10.1093/bioinformatics/btab349.

DOI:10.1093/bioinformatics/btab349
PMID:33978703
Abstract

MOTIVATION

Enhancers are important functional elements in genome sequences. Identifying enhancers is very challenging because enhancer sequences are highly diverse and their locations on genomes are flexible. To date, the interactions between enhancers and genes are not fully understood. To speed up studies of the regulatory roles of enhancers, computational tools for enhancer prediction have emerged in recent years. In particular, thanks to the ENCODE project and advances in high-throughput experimental techniques, a large number of experimentally verified enhancers have been annotated on the human genome, which allows large-scale prediction of unknown enhancers using data-driven methods. However, except for human and some model organisms, validated enhancer annotations are scarce for most species, which makes computational identification of enhancers in their genomes more difficult.

RESULTS

In this study, we propose a deep learning-based enhancer predictor, named CrepHAN, which features a hierarchical attention neural network and word embedding-based representations of DNA sequences. We use experimentally supported data from the human genome to train the model, and perform experiments on human and other mammals, including mouse, cow and dog. The experimental results show that CrepHAN has an advantage in cross-species prediction and outperforms the existing models by a large margin. In particular, for human-mouse cross-predictions, the area under the receiver operating characteristic (ROC) curve (AUC) increases by 0.033 to 0.145 on the combined tissue dataset and by 0.032 to 0.109 on tissue-specific datasets.
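As a rough illustration of the two ideas named in the results, the sketch below shows one common way to turn a DNA sequence into "words" suitable for word embeddings, and how an AUC score can be computed from labels and prediction scores. This is not the authors' implementation: the abstract does not state how sequences are tokenized, so overlapping k-mers, the values of `k`, `stride` and `size`, and the helper names `kmer_words`, `group_words` and `roc_auc` are all assumptions made for illustration.

```python
from typing import List, Sequence


def kmer_words(seq: str, k: int = 3, stride: int = 1) -> List[str]:
    """Split a DNA sequence into overlapping k-mer "words".

    Assumption: k-mer tokenization; k and stride are illustrative defaults.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def group_words(words: Sequence[str], size: int) -> List[List[str]]:
    """Group consecutive words into fixed-size "sentences".

    A hierarchical model can attend over words within a group and then over
    groups. The last group may be shorter than `size`.
    """
    return [list(words[i:i + size]) for i in range(0, len(words), size)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    words = kmer_words("ACGTAC", k=3)
    print(words)
    print(group_words(words, size=2))
    print(roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

The pairwise AUC shown here is quadratic in the number of examples and is meant only to make the definition concrete; a rank-based or library implementation would be the usual choice for datasets of realistic size.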

AVAILABILITY AND IMPLEMENTATION

bcmi.sjtu.edu.cn/~yangyang/CrepHAN.html.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1
CrepHAN: cross-species prediction of enhancers by using hierarchical attention networks.
Bioinformatics. 2021 Oct 25;37(20):3436-3443. doi: 10.1093/bioinformatics/btab349.
2
RicENN: Prediction of Rice Enhancers with Neural Network Based on DNA Sequences.
Interdiscip Sci. 2022 Jun;14(2):555-565. doi: 10.1007/s12539-022-00503-5. Epub 2022 Feb 21.
3
Cross-species enhancer prediction using machine learning.
Genomics. 2022 Sep;114(5):110454. doi: 10.1016/j.ygeno.2022.110454. Epub 2022 Aug 25.
4
5
Opening up the blackbox: an interpretable deep neural network-based classifier for cell-type specific enhancer predictions.
BMC Syst Biol. 2016 Aug 1;10 Suppl 2(Suppl 2):54. doi: 10.1186/s12918-016-0302-3.
6
DECODE: a Deep-learning framework for Condensing enhancers and refining boundaries with large-scale functional assays.
Bioinformatics. 2021 Jul 12;37(Suppl_1):i280-i288. doi: 10.1093/bioinformatics/btab283.
7
BiRen: predicting enhancers with a deep-learning-based model using the DNA sequence alone.
Bioinformatics. 2017 Jul 1;33(13):1930-1936. doi: 10.1093/bioinformatics/btx105.
8
Integrating distal and proximal information to predict gene expression via a densely connected convolutional neural network.
Bioinformatics. 2020 Jan 15;36(2):496-503. doi: 10.1093/bioinformatics/btz562.
9
A deep learning framework for enhancer prediction using word embedding and sequence generation.
Biophys Chem. 2022 Jul;286:106822. doi: 10.1016/j.bpc.2022.106822. Epub 2022 May 5.
10
Predicting enhancers with deep convolutional neural networks.
BMC Bioinformatics. 2017 Dec 1;18(Suppl 13):478. doi: 10.1186/s12859-017-1878-3.

Cited by

1
Systematic representation and optimization enable the inverse design of cross-species regulatory sequences in bacteria.
Nat Commun. 2025 Feb 19;16(1):1763. doi: 10.1038/s41467-025-57031-1.
2
Predmoter-cross-species prediction of plant promoter and enhancer regions.
Bioinform Adv. 2024 May 24;4(1):vbae074. doi: 10.1093/bioadv/vbae074. eCollection 2024.
3
A novel method for identifying key genes in macroevolution based on deep learning with attention mechanism.
Sci Rep. 2023 Nov 13;13(1):19727. doi: 10.1038/s41598-023-47113-9.
4
Optimizing Hyperparameter Tuning in Machine Learning to Improve the Predictive Performance of Cross-Species N6-Methyladenosine Sites.
ACS Omega. 2023 Oct 13;8(42):39420-39426. doi: 10.1021/acsomega.3c05074. eCollection 2023 Oct 24.
5
iEnhancer-DCSA: identifying enhancers via dual-scale convolution and spatial attention.
BMC Genomics. 2023 Jul 13;24(1):393. doi: 10.1186/s12864-023-09468-1.
6
From shallow to deep: some lessons learned from application of machine learning for recognition of functional genomic elements in human genome.
Hum Genomics. 2022 Feb 18;16(1):7. doi: 10.1186/s40246-022-00376-1.
7
Comprehensive Genomic Discovery of Non-Coding Transcriptional Enhancers in the African Malaria Vector .
Front Genet. 2022 Jan 10;12:785934. doi: 10.3389/fgene.2021.785934. eCollection 2021.