
LOGOWheat: deep learning-based prediction of regulatory effects for noncoding variants in wheats.

Authors

Kong Lingpeng, Cheng Hong, Zhu Kun, Song Bo

Affiliations

Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, No. 97 Buxin Road, Dapeng New District, Shenzhen 518124, China.

State Key Laboratory of Crop Stress Adaptation and Improvement, School of Life Sciences, Henan University, No. 379 Mingli Road (North Section), Zhengzhou 450046, China.

Publication information

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae705.

DOI:10.1093/bib/bbae705
PMID:39789857
Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11717721/
Abstract

Identifying the regulatory effects of noncoding variants presents a significant challenge. Recently, the accumulation of epigenomic profiling data in wheat has provided an opportunity to model the functional impacts of these variants. In this study, we introduce Language of Genome for Wheat (LOGOWheat), a deep learning-based tool designed to predict the regulatory effects of noncoding variants in wheat. LOGOWheat initially employs a self-attention-based, contextualized pretrained language model to acquire bidirectional representations of the unlabeled wheat reference genome. Epigenomic profiling data are also collected and utilized to fine-tune the model, enabling it to discern the regulatory code inherent in genomic sequences. The test results suggest that LOGOWheat is highly effective in predicting multiple chromatin features, achieving an average area under the receiver operating characteristic curve (AUROC) of 0.8531 and an average area under the precision-recall curve (AUPRC) of 0.7633. Two case studies illustrate the main functions provided by LOGOWheat: scoring and prioritizing causal variants within a given variant set, and constructing an in silico saturated mutagenesis map to discover high-impact sites or functional motifs in a given sequence. Finally, we propose the concept of extracting potential functional variations from the wheat population by integrating evolutionary conservation information. LOGOWheat is available at http://logowheat.cn/.
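
The in silico saturated mutagenesis idea mentioned in the abstract can be illustrated with a minimal sketch. This is not LOGOWheat's code or API: `toy_score` is a hypothetical stand-in (GC fraction) for a trained sequence model, and the function names `saturation_mutagenesis` and `rank_variants` are assumptions made for illustration only. The sketch substitutes every position with each alternative base, records the change in score relative to the reference sequence, and ranks variants by absolute effect.

```python
from typing import Callable, List, Tuple

BASES = "ACGT"

# (position, reference base, alternative base, score change)
Effect = Tuple[int, str, str, float]


def toy_score(seq: str) -> float:
    """Hypothetical placeholder for a trained model: fraction of G/C bases."""
    return sum(base in "GC" for base in seq) / len(seq)


def saturation_mutagenesis(
    seq: str, score: Callable[[str], float] = toy_score
) -> List[Effect]:
    """Score every single-base substitution against the reference sequence."""
    ref_score = score(seq)
    effects: List[Effect] = []
    for pos, ref_base in enumerate(seq):
        for alt in BASES:
            if alt == ref_base:
                continue  # skip the reference allele itself
            mutated = seq[:pos] + alt + seq[pos + 1:]
            effects.append((pos, ref_base, alt, score(mutated) - ref_score))
    return effects


def rank_variants(effects: List[Effect]) -> List[Effect]:
    """Order variants by absolute predicted effect, largest first."""
    return sorted(effects, key=lambda e: abs(e[3]), reverse=True)


if __name__ == "__main__":
    for pos, ref, alt, delta in rank_variants(saturation_mutagenesis("ACGT"))[:3]:
        print(f"pos={pos} {ref}>{alt} delta={delta:+.2f}")
```

In a real setting the placeholder scorer would be replaced by the model's predicted chromatin-feature probabilities, with one score change per feature rather than a single number.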


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f66d/11717721/c9de467380c3/bbae705f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f66d/11717721/943be52c9dd3/bbae705f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f66d/11717721/4cb9d0a26167/bbae705f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f66d/11717721/3faff0329d0e/bbae705f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f66d/11717721/dfb6bb9094ca/bbae705f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f66d/11717721/278bcd9f6e16/bbae705f6.jpg

