FastMotif: spectral sequence motif discovery.

Author information

Colombo Nicoló, Vlassis Nikos

Affiliations

Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg.

Adobe Research, San Jose, CA, USA.

Publication information

Bioinformatics. 2015 Aug 15;31(16):2623-31. doi: 10.1093/bioinformatics/btv208. Epub 2015 Apr 16.

Abstract

MOTIVATION

Sequence discovery tools play a central role in several fields of computational biology. In the framework of Transcription Factor binding studies, most of the existing motif finding algorithms are computationally demanding, and they may not be able to support the increasingly large datasets produced by modern high-throughput sequencing technologies.

RESULTS

We present FastMotif, a new motif discovery algorithm that is built on a recent machine learning technique referred to as Method of Moments. Based on spectral decompositions, our method is robust to model misspecifications and is not prone to locally optimal solutions. We obtain an algorithm that is extremely fast and designed for the analysis of big sequencing data. On HT-Selex data, FastMotif extracts motif profiles that match those computed by various state-of-the-art algorithms, but one order of magnitude faster. We provide a theoretical and numerical analysis of the algorithm's robustness and discuss its sensitivity with respect to the free parameters.

AVAILABILITY AND IMPLEMENTATION

The Matlab code of FastMotif is available from http://lcsb-portal.uni.lu/bioinformatics.

CONTACT

vlassis@adobe.com

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
