Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, United States.
Commack High School, Commack, NY 11725, United States.
Bioinformatics. 2024 Mar 4;40(3). doi: 10.1093/bioinformatics/btae092.
Deep neural networks (DNNs) have been widely applied to predict the molecular functions of the non-coding genome. DNNs are data hungry and thus require many training examples to fit data well. However, functional genomics experiments typically generate limited amounts of data, constrained by the activity levels of the molecular function under study inside the cell. Recently, EvoAug was introduced to train genomic DNNs with evolution-inspired augmentations. EvoAug-trained DNNs have demonstrated improved generalization and improved interpretability under attribution analysis. However, EvoAug only supports PyTorch-based models, which prevents its application to the broad class of genomic DNNs built in TensorFlow. Here, we extend EvoAug's functionality to TensorFlow in a new package, which we call EvoAug-TF. Through a systematic benchmark, we find that EvoAug-TF yields performance comparable to that of the original EvoAug package.
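To make the idea of augmenting genomic training data concrete, the following is a minimal NumPy sketch of two sequence augmentations applied to one-hot encoded DNA. It is not the EvoAug-TF API: the function names `random_mutation` and `reverse_complement`, the `mutate_frac` parameter, and the A, C, G, T channel order are all assumptions made for illustration, and the abstract does not enumerate which augmentations the package provides.

```python
import numpy as np


def random_mutation(x, mutate_frac=0.05, rng=None):
    """Resample a fraction of positions in each one-hot sequence.

    x: array of shape (N, L, 4), one-hot encoded (hypothetical layout).
    Returns a new array; the input is left unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, length, alphabet = x.shape
    num_mut = int(round(mutate_frac * length))
    x_aug = x.copy()
    for i in range(n):
        # Choose distinct positions, then draw a new base for each one.
        pos = rng.choice(length, size=num_mut, replace=False)
        new_bases = rng.integers(0, alphabet, size=num_mut)
        x_aug[i, pos, :] = 0
        x_aug[i, pos, new_bases] = 1
    return x_aug


def reverse_complement(x):
    """Reverse-complement one-hot sequences.

    Assumes channel order A, C, G, T, so that flipping the channel
    axis swaps A<->T and C<->G.
    """
    return x[:, ::-1, ::-1]


# Usage: augment a small random batch before a training step.
rng = np.random.default_rng(0)
batch = np.eye(4)[rng.integers(0, 4, size=(8, 200))]  # shape (8, 200, 4)
augmented = random_mutation(batch, mutate_frac=0.05, rng=rng)
```

In a training loop, a transformation like this would typically be applied freshly to each mini-batch so that the model rarely sees exactly the same sequence twice; how EvoAug-TF schedules or combines its augmentations is described in its own documentation.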
EvoAug-TF is freely available and is distributed under the open-source MIT license. The source code is hosted on GitHub (https://github.com/p-koo/evoaug-tf). The packaged release is available on PyPI (https://pypi.org/project/evoaug-tf), with in-depth documentation on ReadTheDocs (https://evoaug-tf.readthedocs.io). The scripts for reproducing the results are available at https://github.com/p-koo/evoaug-tf_analysis.