YAMDA: thousandfold speedup of EM-based motif discovery using deep learning libraries and GPU.

Affiliations

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.

Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA.

Publication information

Bioinformatics. 2018 Oct 15;34(20):3578-3580. doi: 10.1093/bioinformatics/bty396.

Abstract

MOTIVATION

Motif discovery in large biopolymer sequence datasets can be computationally demanding, which presents a significant challenge for omics research. MEME, arguably one of the most popular motif discovery tools, takes quadratic time with respect to dataset size, leading to excessively long runtimes on large datasets. There is therefore a demand for fast programs that can generate results of the same quality as MEME.

RESULTS

Here we describe YAMDA, a highly scalable motif discovery software package. It is built on PyTorch, a deep learning library for tensor computation with strong GPU acceleration, whose highly optimized tensor operations are also useful for motif discovery. YAMDA finds motifs as accurately as MEME in linear time, completing in seconds or minutes, which translates to speedups of over a thousandfold.
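To illustrate why tensor operations fit motif work, the following is a minimal sketch of one common building block: scoring every position of a sequence against a motif weight matrix. It is not taken from YAMDA and does not implement EM. The one-hot encoding, the uniform background, the function names and the example motif are all assumptions made for illustration. The sliding dot product in `scan` is the same computation a 1-D convolution performs, which is why a GPU tensor library can batch it over many sequences, and its cost grows linearly with sequence length.

```python
import math

ALPHABET = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 0/1 values in A, C, G, T order."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in seq]

def log_odds(pwm, background=0.25):
    """Turn per-position base probabilities into log-odds weights.

    Assumes a uniform background frequency (an illustrative choice).
    """
    return [[math.log(p / background) for p in column] for column in pwm]

def scan(seq, weights):
    """Return one motif score per start position in seq.

    Each score is a dot product between the motif weights and a window
    of the one-hot sequence: the operation a 1-D convolution computes.
    """
    x = one_hot(seq)
    width = len(weights)
    scores = []
    for start in range(len(seq) - width + 1):
        total = 0.0
        for offset in range(width):
            for k in range(len(ALPHABET)):
                total += x[start + offset][k] * weights[offset][k]
        scores.append(total)
    return scores

# Hypothetical 3-position motif that prefers "TAG" (made-up numbers).
pwm = [
    [0.1, 0.1, 0.1, 0.7],  # position 1: mostly T
    [0.7, 0.1, 0.1, 0.1],  # position 2: mostly A
    [0.1, 0.1, 0.7, 0.1],  # position 3: mostly G
]
weights = log_odds(pwm)
scores = scan("CCTAGCC", weights)
best_start = max(range(len(scores)), key=scores.__getitem__)
print(best_start, scores)
```

In this toy input the highest-scoring window starts at index 2, where the sequence reads TAG. In an EM-based method, scores of this kind would feed the expectation step, but that part is outside the scope of this sketch.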

AVAILABILITY AND IMPLEMENTATION

YAMDA is freely available on GitHub (https://github.com/daquang/YAMDA).

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
