EvoAug-TF: Extending evolution-inspired data augmentations for genomic deep learning to TensorFlow.

Authors

Yu Yiyang, Muthukumar Shivani, Koo Peter K

Affiliations

Columbia University, New York, NY, USA.

Commack High School, Commack, NY, USA.

Publication information

bioRxiv. 2024 Jan 18:2024.01.17.575961. doi: 10.1101/2024.01.17.575961.

DOI:10.1101/2024.01.17.575961
PMID:38293144
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10827165/
Abstract

Deep neural networks (DNNs) have been widely applied to predict the molecular functions of regulatory regions in the non-coding genome. DNNs are data hungry and thus require many training examples to fit data well. However, functional genomics experiments typically generate limited amounts of data, constrained by the activity levels of the molecular function under study inside the cell. Recently, EvoAug was introduced to train a genomic DNN with evolution-inspired augmentations. EvoAug-trained DNNs have demonstrated improved generalization and interpretability with attribution analysis. However, EvoAug only supports PyTorch-based models, which limits its applications to a broad class of genomic DNNs based in TensorFlow. Here, we extend EvoAug's functionality to TensorFlow in a new package we call EvoAug-TF. Through a systematic benchmark, we find that EvoAug-TF yields comparable performance with the original EvoAug package.
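
To make the idea of evolution-inspired augmentation concrete, here is a minimal NumPy sketch of three such perturbations (random point mutation, random deletion, and reverse complement) applied to a one-hot encoded DNA sequence. It is an illustration only, not the EvoAug-TF API: the function names, the mutation rate, the deletion length, and the rule for combining augmentations are all assumptions chosen for the example.

```python
import numpy as np

ALPHABET = "ACGT"  # channel order assumed for this sketch


def one_hot(seq: str) -> np.ndarray:
    """Encode a DNA string as an (L, 4) one-hot array in A, C, G, T order."""
    idx = np.array([ALPHABET.index(base) for base in seq])
    return np.eye(4, dtype=np.float32)[idx]


def decode(x: np.ndarray) -> str:
    """Convert an (L, 4) one-hot array back to a DNA string."""
    return "".join(ALPHABET[i] for i in x.argmax(axis=1))


def reverse_complement(x: np.ndarray) -> np.ndarray:
    """Reverse the positions and flip the channels (A<->T, C<->G)."""
    return x[::-1, ::-1].copy()


def random_mutation(x: np.ndarray, rate: float, rng: np.random.Generator) -> np.ndarray:
    """With probability `rate` per position, overwrite it with a uniformly random base."""
    x = x.copy()
    mask = rng.random(x.shape[0]) < rate
    new_idx = rng.integers(0, 4, size=int(mask.sum()))
    x[mask] = np.eye(4, dtype=x.dtype)[new_idx]
    return x


def random_deletion(x: np.ndarray, max_len: int, rng: np.random.Generator) -> np.ndarray:
    """Delete a random contiguous stretch, then pad both ends with random bases
    so the sequence keeps its original length."""
    length = x.shape[0]
    del_len = int(rng.integers(0, max_len + 1))
    start = int(rng.integers(0, length - del_len + 1))
    kept = np.concatenate([x[:start], x[start + del_len:]])
    pad = np.eye(4, dtype=x.dtype)[rng.integers(0, 4, size=del_len)]
    left = del_len // 2
    return np.concatenate([pad[:left], kept, pad[left:]])


def augment(x: np.ndarray, rng: np.random.Generator, max_augs: int = 2) -> np.ndarray:
    """Apply a random combination of 1..max_augs distinct augmentations (max_augs <= 3)."""
    ops = [
        lambda s: random_mutation(s, rate=0.05, rng=rng),  # hypothetical rate
        lambda s: random_deletion(s, max_len=3, rng=rng),  # hypothetical length
        reverse_complement,
    ]
    n_ops = int(rng.integers(1, max_augs + 1))
    for i in rng.choice(len(ops), size=n_ops, replace=False):
        x = ops[i](x)
    return x


rng = np.random.default_rng(0)
x = one_hot("ACGTTGCAAC")
x_aug = augment(x, rng)
print(decode(x), "->", decode(x_aug))
```

Each augmented sequence keeps the label of the original, so in a training loop a fresh perturbation would be drawn for every sequence in every mini-batch. For the package's actual classes and parameters, consult the ReadTheDocs documentation listed under Availability.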

AVAILABILITY

EvoAug-TF is freely available for users and is distributed under an open-source MIT license. Researchers can access the open-source code on GitHub (https://github.com/p-koo/evoaug-tf). The pre-compiled package is provided via PyPI (https://pypi.org/project/evoaug-tf) with in-depth documentation on ReadTheDocs (https://evoaug-tf.readthedocs.io). The scripts for reproducing the results are available at (https://github.com/p-koo/evoaug-tf_analysis).

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7157/10827165/d21bbc118822/nihpp-2024.01.17.575961v1-f0001.jpg

Similar articles

1
EvoAug-TF: Extending evolution-inspired data augmentations for genomic deep learning to TensorFlow.
bioRxiv. 2024 Jan 18:2024.01.17.575961. doi: 10.1101/2024.01.17.575961.
2
EvoAug-TF: extending evolution-inspired data augmentations for genomic deep learning to TensorFlow.
Bioinformatics. 2024 Mar 4;40(3). doi: 10.1093/bioinformatics/btae092.
3
EvoAug: improving generalization and interpretability of genomic deep neural networks with evolution-inspired data augmentations.
Genome Biol. 2023 May 5;24(1):105. doi: 10.1186/s13059-023-02941-w.
4
PyHMMER: a Python library binding to HMMER for efficient sequence analysis.
Bioinformatics. 2023 May 4;39(5). doi: 10.1093/bioinformatics/btad214.
5
Scbean: a python library for single-cell multi-omics data analysis.
Bioinformatics. 2024 Feb 1;40(2). doi: 10.1093/bioinformatics/btae053.
6
Goldilocks: a tool for identifying genomic regions that are 'just right'.
Bioinformatics. 2016 Jul 1;32(13):2047-9. doi: 10.1093/bioinformatics/btw116. Epub 2016 Mar 7.
7
pyInfinityFlow: optimized imputation and analysis of high-dimensional flow cytometry data for millions of cells.
Bioinformatics. 2023 May 4;39(5). doi: 10.1093/bioinformatics/btad287.
8
Medusa: Software to build and analyze ensembles of genome-scale metabolic network reconstructions.
PLoS Comput Biol. 2020 Apr 29;16(4):e1007847. doi: 10.1371/journal.pcbi.1007847. eCollection 2020 Apr.
9
HOMELETTE: a unified interface to homology modelling software.
Bioinformatics. 2022 Mar 4;38(6):1749-1751. doi: 10.1093/bioinformatics/btab866.
10
easyPheno: An easy-to-use and easy-to-extend Python framework for phenotype prediction using Bayesian optimization.
Bioinform Adv. 2023 Mar 22;3(1):vbad035. doi: 10.1093/bioadv/vbad035. eCollection 2023.
