Yu Yiyang, Muthukumar Shivani, Koo Peter K
Columbia University, New York, NY, USA.
Commack High School, Commack, NY, USA.
bioRxiv. 2024 Jan 18:2024.01.17.575961. doi: 10.1101/2024.01.17.575961.
Deep neural networks (DNNs) have been widely applied to predict the molecular functions of regulatory regions in the non-coding genome. DNNs are data hungry and require many training examples to fit data well. However, functional genomics experiments typically generate limited amounts of data, constrained by the activity levels of the molecular function under study inside the cell. Recently, EvoAug was introduced to train genomic DNNs with evolution-inspired augmentations. EvoAug-trained DNNs have demonstrated improved generalization and improved interpretability under attribution analysis. However, EvoAug supports only PyTorch-based models, which limits its applicability to the broad class of genomic DNNs implemented in TensorFlow. Here, we extend EvoAug's functionality to TensorFlow in a new package we call EvoAug-TF. Through a systematic benchmark, we find that EvoAug-TF yields performance comparable to that of the original EvoAug package.
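To make the idea of an evolution-inspired augmentation concrete, the following is a minimal sketch in plain Python. It is not the EvoAug-TF API: the abstract does not enumerate the augmentations, so random point mutation and reverse-complement are assumed here as illustrative examples, and the function names (`one_hot`, `random_mutation`, `reverse_complement`) and the `mutate_frac` parameter are hypothetical.

```python
import random

ALPHABET = "ACGT"  # assumed channel order for the one-hot encoding


def one_hot(seq):
    """Encode a DNA string as a list of length-4 one-hot rows (A, C, G, T)."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq]


def random_mutation(x, mutate_frac, rng):
    """Return a copy of x with round(mutate_frac * L) positions resampled.

    Each chosen position gets a base drawn uniformly from ALPHABET, so a
    chosen position may by chance keep its original base.
    """
    length = len(x)
    num_mutations = round(mutate_frac * length)
    positions = rng.sample(range(length), num_mutations)
    mutated = [row[:] for row in x]  # copy so the input is left unchanged
    for pos in positions:
        new_base = rng.randrange(len(ALPHABET))
        mutated[pos] = [1.0 if i == new_base else 0.0 for i in range(len(ALPHABET))]
    return mutated


def reverse_complement(x):
    """Reverse the sequence and swap A<->T, C<->G.

    With channels ordered A, C, G, T, complementing a base is the same as
    reversing the channel order of its one-hot row.
    """
    return [row[::-1] for row in reversed(x)]


# Hypothetical usage: augment one sequence before it is fed to a model.
rng = random.Random(0)
x = one_hot("ACGTACGTAC")
x_mut = random_mutation(x, mutate_frac=0.2, rng=rng)
x_rc = reverse_complement(x)
```

In a training loop, transformations of this kind would be applied on the fly to each mini-batch, so the model sees a slightly different version of every sequence at each epoch while the labels stay unchanged.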
EvoAug-TF is freely available and is distributed under an open-source MIT license. The source code is available on GitHub (https://github.com/p-koo/evoaug-tf). The pre-compiled package is provided via PyPI (https://pypi.org/project/evoaug-tf), with in-depth documentation on ReadTheDocs (https://evoaug-tf.readthedocs.io). The scripts for reproducing the results are available at https://github.com/p-koo/evoaug-tf_analysis.